The magazine of record for the embedded computing industry
May 2013
www.rtcmagazine.com
Switched Fabrics Offer a Wealth of Strategies Parallel Code Generator Wrings Performance out of Multicore Manage Power for Peak Performance An RTC Group Publication
RUGGED, POWERFUL
COM EXPRESS Intel® Core™ i7 processor Basic COM Express
Freescale QorIQ P2041 Mini COM Express
Freescale QorIQ P2020 Compact COM Express
COM Express modules from X-ES Our family of fully ruggedized COM Express modules support the latest high-performance Freescale QorIQ and Intel® Core™ i7 processors and include soldered down memory with ECC, additional mounting holes, and Class III PCB fabrication and assembly. When you choose X-ES COM Express modules, you are supported with excellent development platforms and innovative rapid-deployment systems. Contact us today to learn more.
Highest performance under any condition. That’s Extreme.
Extreme Engineering Solutions 608.833.1155 www.xes-inc.com
COM
Takes on the Rugged World Cover: COM modules from MEN Micro that are built to the VITA 59 Ruggedized System-On-Module Express (RSE) specification. Such modules are finding increased adoption in rugged application environments.
TABLE OF CONTENTS
VOLUME 22, ISSUE 5

Departments
5 Editorial: Big Data and Big Ideas: Intelligent Systems, M2M and the Internet of Things
6 Industry Insider: Latest Developments in the Embedded Marketplace
8 Small Form Factor Forum: The Future of Flash
38 Products & Technology: Newest Embedded Technology Used by Industry Leaders

EDITOR’S REPORT
Development Tools Speed Delivery
10 Accelerating Build and Test Leads to Faster Software Deployment: A Really Big Deal with Android
Tom Williams

Technology in Context
COM for Rugged Environments
14 Light and Rugged: Small, Qseven Enables New Battlefield Technologies
Dan Demers, congatec and Michele Kasza, Connect Tech
18 Rugged, Reliable and Real: COMs Broaden Their Reach
Barbara Schmitz, MEN Mikro Elektronik

TECHNOLOGY CONNECTED
Strategies for Fabrics
24 PCI Express Fabrics Break Through I/O Limitations
Vincent Chuffart, Kontron

TECHNOLOGY IN SYSTEMS
Data Acquisition with Small Modules
30 MicroTCA / AMC Solutions for Real-Time Data Acquisition
Rodger Hosking, Pentek, Inc.

TECHNOLOGY DEPLOYED
Wringing Performance out of Multicore
34 SequenceL: An Elegant and Efficient Approach to Exploiting the Power of Parallelism
Doug Norton, Texas Multicore Technologies; Larry A. Lambe, Multidisciplinary Software Systems Research; and Richard Luczak, RL Aerodynamics

Industry Watch
Power Consumption vs. Performance
44 Improving Performance through Power Management and Workload Consolidation in Telecommunications
Li Jun, Adlink Technology

Products & Technology highlights
38 Browser-Based Unit Brings Scalable HMIs to PCs, Smartphones, Tablets and Other Mobile Devices
40 Desktop Expansion Enclosure Connects with Either Thunderbolt or PCIe
43 Extremely Rugged CPU Module for Highest Reliability in Harsh Environments
Digital Subscriptions Available at http://rtcmagazine.com/home/subscribe.php RTC MAGAZINE MAY 2013
MAY 2013 Publisher PRESIDENT John Reardon, johnr@rtcgroup.com
ATCA, μTCA, VME AND CPCI SYSTEMS... FASTER.
Editorial EDITOR-IN-CHIEF Tom Williams, tomw@rtcgroup.com SENIOR EDITOR Clarence Peckham, clarencep@rtcgroup.com CONTRIBUTING EDITORS Colin McCracken and Paul Rosenfeld MANAGING EDITOR/ASSOCIATE PUBLISHER Sandra Sillion, sandras@rtcgroup.com COPY EDITOR Rochelle Cohn
Schroff® Systems and Subracks EXPRESS provide VITA and PICMG compliant product solutions faster and at a competitive price. Protect your application with standard or customized electro-mechanical and system products – shipped in as few as two weeks and backed by our global network and more than 60 years of engineering experience. See our complete offering online.
Art/Production ART DIRECTOR Kirsten Wyatt, kirstenw@rtcgroup.com GRAPHIC DESIGNER Michael Farina, michaelf@rtcgroup.com LEAD WEB DEVELOPER Justin Herter, justinh@rtcgroup.com
RAPID DELIVERY VITA and PICMG compliant solutions.
W W W.SCHROFF.US
Advertising/Web Advertising WESTERN REGIONAL ADVERTISING MANAGER Stacy Mannik, stacym@rtcgroup.com (949) 226-2024 MIDWEST REGIONAL AND INTERNATIONAL ADVERTISING MANAGER Mark Dunaway, markd@rtcgroup.com (949) 226-2023 EASTERN REGIONAL ADVERTISING MANAGER Shandi Ricciotti, shandir@rtcgroup.com (949) 573-7660
Billing Cindy Muir, cmuir@rtcgroup.com (949) 226-2021
EMBEDDED SOLUTIONS R-SERIES APU
MSC Embedded Inc. Tel. +1 650 616 4068 info@mscembedded.com www.mscembedded.com
COM Express™ - MSC C6C-A7 Ultimate graphics and video performance The MSC C6C-A7 module is based on AMD‘s Embedded R-Series platform delivering high-performance processing coupled with a premium high definition visual experience in a power efficient solution. Supporting OpenCL™ it can boost the computing performance using the graphics engines for parallel processing.
AMD Embedded R-Series Accelerated Processing Units R-460L quad-core, 2.0/2.8 GHz R-452L quad-core, 1.6/2.4 GHz R-260H dual-core, 2.1/2.6 GHz R-252F dual-core, 1.7/2.3 GHz
AMD Radeon™ HD 7000G Series graphics Up to 16 GB DDR3 SDRAM MicroSD card socket, bootable Three DisplayPort/HDMI/DVI interfaces VGA and LVDS/Emb. DisplayPort Four independent displays supported DirectX 11, OpenGL 4.2, OpenCL 1.1
To Contact RTC magazine: HOME OFFICE The RTC Group, 905 Calle Amanecer, Suite 250, San Clemente, CA 92673 Phone: (949) 226-2000 Fax: (949) 226-2050, www.rtcgroup.com Editorial Office Tom Williams, Editor-in-Chief 1669 Nelson Road, No. 2, Scotts Valley, CA 95066 Phone: (831) 335-1509
Published by The RTC Group Copyright 2013, The RTC Group. Printed in the United States. All rights reserved. All related graphics are trademarks of The RTC Group. All other brand and product names are the property of their holders.
EDITORIAL MAY 2013
Tom Williams Editor-in-Chief
Big Data and Big Ideas: Intelligent Systems, M2M and the Internet of Things
It is sometimes fascinating to watch how colloquial expressions arise, are used, drop out of use, and are often used with only an assumption regarding what they actually mean and, more importantly, what they may imply about the future. Some of the expressions flying around today include “intelligent systems,” the “Internet of Things” and “machine-to-machine systems.” Now, I am not saying that these terms are misunderstood. It’s more that they are very broadly applicable, and as we drill down to more specific examples and implementations, a definition, or sub-definition, comes into sharper focus.

Take machine-to-machine systems for example. If we just say “machine-to-machine,” we might as well just refer to the Internet of Things, because it tends to refer to the entire universe of autonomous communicating gizmos. When we add the word “systems” to the M2M, we can get more specific. Here we are talking about a subset of the IoT under which the communicating gizmos share a specific purpose such as a transportation management and control system. Here they are sharing GPS data, data about freight content and destination, vehicle maintenance data and more.

Very often, data is passed among elements of such a system as needed and without human intervention, such as when freight is loaded and unloaded, causing automatic routing changes. Maintenance data may be uploaded and stored for later use, or specific bits of data may trigger alarms or cause some preset activity to be initiated. In any event, that data, and the control of the given systems, is available to a human operator at some remote location. Parts of it may also be available to the driver. Such an M2M system can be thought of as a microcosm of the larger world of Intelligent Systems, which encompasses the IoT, the Internet, the Cloud and pretty much everything else.
What both demonstrate, however, is that data is made available and exchanged among all kinds of devices and entities, and much of it is gathered and transmitted automatically and autonomously. How it is used is subject to the needs of a given application and the creativity of the consumers of that data. In the case of the M2M systems, those needs are oriented around transportation but the uses of the data can grow to encompass all kinds of corporate analysis that might not have been envisioned when the system was originally set up.

One intriguing example comes from the use of the on-board diagnostic (OBD) port that is part of every automobile designed since 1996. Originally designed for maintenance purposes, the OBD port interacts with the car’s computer and can supply a wide variety of diagnostic data. You can even buy a $50 device from Amazon to read codes when the “check engine” light comes on to avoid taking the car into the shop for a loose gas cap. But there are more creative uses for the data supplied by the OBD port.

At least one insurance company now offers discounts based on data gathered from the OBD port that can be used for purposes other than maintenance. The promo material says it “automatically keeps track of your good driving.” Of course, it also keeps track of your bad driving. It collects time and speed data and supplies the number of miles driven, the time they were driven and how often and how hard the driver brakes. This data is made available to the insurance company to establish a profile of driving behavior that can be used to establish insurance rates. No doubt, somewhere in the innards of the insurance company are also IT applications that collect these individual results along with data gathered from other sources and use them for all sorts of analytic purposes that serve the company’s business model.

Since using the device is voluntary and there are definite assurances of privacy, there is little to worry about in this case, but it is instructive of how data generated by devices for an original purpose can flow into Big Data for completely different uses. We can be left to wonder how long it may be before law enforcement gets the idea that such driving data, perhaps also correlated with GPS data, would be useful for their purposes. Imagine the day when you have to have a sum of money on deposit with the DMV and amounts would be deducted for violations transmitted from your car. If the amount is depleted and not replaced within a specific time, a code could be sent to shut down the car. This is all possible today.
Now that’s just my paranoid idea for a creative use of Big Data, but you see where such situations can lead. We are now in the world of Intelligent Systems, which is also a continuum from small to large, micro to macro. It now seamlessly joins the smallest devices with the largest systems and the consumer world with the industrial world. Learning to navigate this world effectively, productively and safely will be an ongoing adventure.
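For readers who want to see how little it takes to turn time and speed readings into the kind of driving profile described in the insurance example, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the `SpeedSample` record, the `summarize_trip` function and the 7 mph-per-second hard-braking threshold are hypothetical and are not taken from any insurer’s device or from the OBD specification.

```python
from dataclasses import dataclass


@dataclass
class SpeedSample:
    time_s: float     # seconds since the trip started
    speed_mph: float  # vehicle speed at that instant


# Assumed threshold for "hard" braking; purely illustrative.
HARD_BRAKE_MPH_PER_S = 7.0


def summarize_trip(samples):
    """Return (miles_driven, hard_brake_intervals) for time-ordered samples."""
    miles = 0.0
    hard_brake_intervals = 0
    for prev, cur in zip(samples, samples[1:]):
        dt = cur.time_s - prev.time_s
        if dt <= 0:
            continue  # skip duplicate or out-of-order readings
        # Trapezoidal estimate of the distance covered in this interval.
        miles += (prev.speed_mph + cur.speed_mph) / 2.0 * dt / 3600.0
        # Average deceleration over the interval, in mph per second.
        deceleration = (prev.speed_mph - cur.speed_mph) / dt
        if deceleration >= HARD_BRAKE_MPH_PER_S:
            hard_brake_intervals += 1
    return miles, hard_brake_intervals


trip = [
    SpeedSample(0, 60),
    SpeedSample(60, 60),
    SpeedSample(62, 40),   # 20 mph shed in 2 seconds
    SpeedSample(122, 40),
]
miles, hard_brakes = summarize_trip(trip)
print(f"{miles:.2f} miles, {hard_brakes} hard-braking interval(s)")
```

A few dozen lines like these, run against every trip, are enough to build the profile; the interesting questions are about who else gets to run them.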
INDUSTRY INSIDER
MAY 2013

SGET Certifies Computer-on-Module Standard for ARM/SoC Processors
SGET, the Standardization Group for Embedded Technologies, has passed its first standard. The new global standard under the brand name “SMARC” (for Smart Mobility ARChitecture for ARM-based SoC modules) is based on ULP-COM, the term which up to now was used for Ultra Low Power Computer-on-Modules. Along with Kontron and Adlink, numerous other SGET members were involved in the definition of the new standard. “The Standard Development Team SDT.01 got together about two months ago and managed to adapt and pass the specification in this short space of time,” Engelbert Hörmannsdorfer, chairman of the SGET, explains. “This demonstrates on the one hand how well and how fast the SGET can work together with industry members. On the other hand, it shows just how effective the setup of SGET is.”

A number of companies in the embedded technologies industry have recognized the need for a dedicated ARM COM standard, due to the fact that the different interfaces in ARM and x86 often require specific implementation. New chipset interfaces need a future-proof pin-out. Many more providers of embedded technologies are of the same opinion and have, over the past few weeks, announced their membership in the SGET and their collaboration with the Standard Development Team SDT.01. “Fast standardization processes don’t work with long, drawn-out procedures,” Hörmannsdorfer goes on to comment. “These days, time-to-market is the deciding factor, and this is where the SGET wants to support the embedded technologies branch.”

The SMARC specifications will be freely available for download on the SGET website according to the SGET terms of membership, both for SMARC developers as well as for users and carrier providers. First products with different SoC manufacturers of ARM processors (Nvidia T3, Freescale i.MX6, TI Sitara) based on the SMARC standard have already become available.
Other companies in the embedded computing industry are invited to join the SGET and contribute their ideas. Apart from embedded computing manufacturers on board and system levels, chip and connector manufacturers, research and educational institutions as well as embedded system integrators, OEM solution providers and industrial users are most welcome.
Express Logic and Twin Oaks Integrate CoreDX DDS Communications Middleware with ThreadX RTOS
Express Logic has announced that its ThreadX RTOS now supports the CoreDX Data Distribution Service (DDS) communications middleware from Twin Oaks Computing. CoreDX DDS simplifies communication processes among different system types, easing the communications challenge between the embedded mobile device and the enterprise-, IT-based network application. The integration of ThreadX RTOS and CoreDX DDS strongly complements ThreadX’s broad deployment in wireless and mobile devices such as routers, PDAs, Android phones and portable medical devices.
CoreDX DDS implements a simple, efficient and universal way to coordinate communications. CoreDX DDS is a cross-language, cross-operating system, cross-platform middleware (or IPC) solution that manages communications not only between networks and devices, but also provides an essential service for interprocessor communications in multicore or multiprocessor systems. CoreDX DDS simplifies communication processes, making distributed development easier, faster and more reliable. Express Logic’s ThreadX RTOS offers a robust library of application-callable operating system services that simplify and optimize the performance of embedded systems. Designed for such microcontroller-based applications, ThreadX features a memory footprint as small as 2 Kbytes, which enables it to reside in even the most limited on-chip MCU memory. ThreadX provides preemptive, real-time, priority-based scheduling for optimum responsiveness and high performance, and includes services such as thread scheduling, message passing, resource allocation, synchronization and interrupt management.
Curtiss-Wright Controls and CoreAVI Announce Partnership
Curtiss-Wright Controls Defense Solutions (CWCDS) has partnered with Channel One and its subsidiary Core Avionics & Industrial (CoreAVI) to provide the defense and aerospace embedded COTS market with a 15-year lifetime support guarantee for the XMC-715 graphics controller and future generations of AMD Radeon-based embedded Graphics Processing Units (GPUs). This partnership builds on the companies’ existing successful relationship for support on the AMD Radeon M9 GPU that enables CWCDS to continue to provide PMC-704 and PMC-706 graphics controllers for many more years to come.

Under the agreement, CoreAVI, a closely aligned Value Added Reseller (VAR) of AMD components, will provide and support AMD’s Radeon E4690 GPU for use on Curtiss-Wright’s XMC-715 graphics controller XMC (VITA 42) mezzanine card as well as for future generations of AMD components for use on Curtiss-Wright high-performance graphics controllers. Support will be provided under CoreAVI’s Program Ready Components program. The program provides a high value service through proven capabilities for long-term storage, extended temperature screening and testing with traceability to the original AMD source of supply and is independent of AMD’s production schedule.

In addition, CoreAVI will provide Curtiss-Wright with high-performance embedded OpenGL ES and OpenGL SC graphics drivers integrated with Curtiss-Wright graphics controllers and Single Board Computers (SBCs) supporting Wind River VxWorks and Green Hills Integrity among other RTOSs. Safety-critical requirements will be supported by CoreAVI with certification evidence kits and support for DO-178B/C / ED-12B/C Level A and DO-254 required for regulatory approval.
Apple and Android Account for 82% of All App Downloads, But More OS Choices Could Roil Market
According to a recent report by research2guidance, Apple’s market share of app downloads has fallen from 81% in 2008 to 39% at the end of 2012, while Android’s app downloads have increased year over year, reaching 42% at the end of 2012. The result is that the two platforms together now account for 82% of all app downloads. But the report raises the question of whether this duopoly will last for the next few years or if the market may see a major change in structure.

The most likely scenario is that the duopoly will fracture into a more heterogeneous mobile operating system landscape. There is evidence already in 2013 that the market will enter a new phase with more relevant mobile app platforms, which would be the beginning of the end of the duopoly. New platforms and choices are starting to appear in the form of Microsoft, BlackBerry, Firefox, Ubuntu or Jolla that will challenge the two market leaders. Some have financial muscle, others have cool new concepts and solutions. All seem to attract a lot of market hype, indicating that the market has grown a bit bored with iOS and Android and is looking for something new.

Part of the source for what may look disruptive is that today, 90% of all smartphones come with an iOS or Android operating system. Maybe mobile phone users can live with this uniformity, but most of the smartphone device manufacturers can’t, at least if they don’t want to compete only on price. As a consequence, new and fresh operating systems will be loaded on smartphones, especially from second-tier device manufacturers like HTC, Huawei and ZTE, to offer something different.
This market shift would have an impact on the app initiatives companies have launched over the course of the last few years. Every app developer and publisher will have to adjust their app development and distribution strategy to compete in a market with 5-7 relevant mobile operating systems. In essence, app development and distribution will become more complex. At the end of Q2 2013, research2guidance will release the 3rd edition of the “MultiPlatform App Development Solution Report.”
Yocto Project Compatible Carrier Grade Linux
Wind River has introduced the Wind River Linux Carrier Grade (CG) Profile for the latest version of Wind River Linux. Formally registered for the CGL 5.0 specification with the Linux Foundation, the profile is the first delivery of Carrier Grade Linux functionalities on top of a Yocto Project Compatible product. With Wind River Linux as a base, the Linux CG Profile gives customers a turnkey platform that allows them to meet their CGL requirements. Additional profiles will continue to be developed to address a variety of market specific needs. The mix and match nature of these profiles for Wind River Linux offers developers flexibility and choice to meet a diverse range of specialized needs.

Carrier Grade Linux registration requires several key requirements to be met, including compliance to standards, support for highly available hardware, serviceability, performance, high availability, clustering and security. Carrier-grade products typically require up to five nines or six nines (99.999 to 99.9999 percent) availability, translating to downtime as low as 30 seconds a year. Additionally, given rising network traffic growth and the associated need for greater security of this data, the CGL requirements designed to help make systems more reliable and resistant to attacks become even more significant. Carrier grade is a hard requirement for networking devices, but it can also apply to large corporate infrastructures, data centers and highly mobile devices.

“By providing Yocto Project-based Carrier Grade Linux, Wind River is furthering cross-architecture support and helping developers who must deliver carrier-grade requirements to overcome a myriad of complicated development challenges and meet tight deadlines,” said Amanda McPherson, vice president of marketing and developer programs at The Linux Foundation.
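The availability figures quoted above are easy to check with a few lines of Python. This is only a back-of-the-envelope sketch: the function name `annual_downtime_seconds` is made up for illustration, and it assumes an average year of 365.25 days. Six nines works out to roughly 32 seconds of downtime a year, in line with the “as low as 30 seconds” cited, while five nines allows a little over five minutes.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 60 * 60  # average year, leap days included


def annual_downtime_seconds(availability_percent: float) -> float:
    """Return the maximum downtime per year allowed by an availability target."""
    unavailable_fraction = 1.0 - availability_percent / 100.0
    return unavailable_fraction * SECONDS_PER_YEAR


for label, percent in (("five nines", 99.999), ("six nines", 99.9999)):
    seconds = annual_downtime_seconds(percent)
    print(f"{label} ({percent}%): {seconds:.1f} s/year ({seconds / 60:.2f} min/year)")
```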
Silicon Labs Achieves ZigBee Golden Unit Certification
Silicon Labs has announced that its Ember ZigBee solutions (silicon devices, software and development tools) have achieved Golden Unit certification from the ZigBee Alliance for the newly released ZigBee IP specification. ZigBee IP is the first open standard for IPv6-based wireless mesh networking solutions, providing seamless, end-to-end Internet connectivity and a scalable architecture to control low-power devices. The new ZigBee IP specification adds network and security layers and an application framework to the IEEE 802.15.4 standard. It supports cost-effective, energy-efficient wireless mesh networks based on standard Internet protocols such as IPv6, 6LoWPAN, PANA, RPL, TCP, TLS and UDP. Ultimately, ZigBee IP will provide a standards-based foundation for Internet of Things (IoT) applications ranging from smart meters for the smart grid to in-home energy management systems to wireless sensor networks.

As one of the first to be certified by the ZigBee Alliance, Silicon Labs’ Ember ZigBee solutions, including EM35x wireless system-on-chip (SoC) devices, Ember ZigBee IP networking software and development tools, will serve as a development platform for building and testing future connected products based on the ZigBee IP specification. The Golden Unit certification process instills confidence among developers and end users that all connected device products for the IoT from different vendors will interoperate seamlessly.
GrammaTech Helps EEMBC Ensure Robustness of Processor Benchmarks
GrammaTech has announced that the Embedded Microprocessor Benchmark Consortium (EEMBC) has adopted GrammaTech’s static analysis tool, CodeSonar, to analyze, review and test the robustness of code-based benchmark suites. The first EEMBC benchmark examined with CodeSonar is the upcoming FPMark, a comprehensive suite of Floating-Point (FP) benchmarks designed to provide a standardized, industry-wide accepted measure for floating-point performance. FPMark is used to evaluate the capabilities of embedded processors in a wide variety of embedded applications, including audio, automotive and motor control. The EEMBC workgroup’s goal is to create benchmarks that will expose and highlight the performance gains from innovations in terms of real application performance. “As an industry standard organization, we have little tolerance for faulty benchmark code because so many people use and rely on it,” said Markus Levy, president of EEMBC. “GrammaTech’s CodeSonar was used to avoid potential defects in the FPMark code to ensure a reliable benchmark certification suite.”
SMALL FORM FACTOR
FORUM Colin McCracken
The Future of Flash
What does a small form factor embedded system have in common with a smartphone, tablet, reader, Ultrabook, or even a data center? Not much. Or as little as possible, one might hope. These days, however, data and code storage technologies, namely flash memory, are common across most of these device types thanks to the rapid flash advancements and commoditization over several years. NAND flash has replaced NOR flash for most purposes, driven by cost deltas and bi-directional read/write usage far overshadowing read speed. Multi-level/layer cell technology, known as MLC, improves densities and costs with two (or more) bits per cell. Although shrinking geometries tend to challenge the write and erase times for floating gates, the industry chugs on unabated. Some OEMs favor single-level/layer cell technology (SLC) due to reliability concerns, but again the industry is very resilient and will improve designs where there is demand. Even if flash for consumers is deemed inadequate for the most discriminating embedded apps, there are suppliers who can go the extra mile.

What exactly does “embedded flash” mean? Historically, various disk-on-module solutions plugged into memory sockets on a board, or plugged into PCMCIA or CompactFlash connectors. These days, there are all sorts of standards with various parallel and serial interfaces as well, from SD/MMC to USB flash to 2.5” SATA SSDs that mechanically mount as replacements for rotating disk drives. Although most of these are known as consumer or enterprise devices, they are useful in embedded systems, for file transfers and booting OS images in development all the way to deployment as part of the systems. While data center solutions range from stand-alone boxes with Ethernet connectors to large slot cards with PCIe interfaces that plug into standard backplanes for any rackmount chassis, embedded modules are chip-level or board-level in nature, optimized for size, weight, power and cost.
Recently, single-chip ICs are even soldered on tiny CPU modules, although clearly these aren’t pluggable or removable. On the flip side, they are as rugged as soldered RAM vis-a-vis an SODIMM socketed RAM module. The chips have capacities that can hold even a large OS, application and Linux or Windows file system.
“Embedded flash” can also refer to customized hardware for applications that require very high security and data protection. Virtually any system connected to a network can be hacked, if not easily, at least theoretically by people intimately knowledgeable about the flash type being used. Flash modules usually come with a controller that is the unsung hero of the module. Inherent limitations of flash such as limited write/erase cycles and bad regions that cause bit errors are overcome by sophisticated wear-leveling algorithms that relocate memory regions that see heavy write traffic. Controllers even compress and decompress data on the fly. Each controller is optimized for a certain interface (hardware and protocol), whether serial / differential pair like USB 3.0 or SATA II, or slower and lower-power-consumption parallel SD card interface.

“Industrial flash” can mean a number of things at the module level, from wide temperature range or long lifecycle to SLC flash or even all three. In some ways, MLC and long lifecycle are pitted against each other. But you can expect to pay more, and rightfully so, for any of these industrial benefits since they go against the cost-focused grain of consumer media.

Where is it all going? The consumer and enterprise segments are on well-established trajectories. But the embedded market fosters additional degrees of freedom, which manufacturers can target for their various specializations. At the lowest level, firmware gurus can serve up nearly unbreakable security or wear-leveling that is optimized for even just one specific system. Some NRE money for the custom firmware tweaking is appropriate here. At the module level, we’ve only seen the tip of the iceberg in terms of new standards, since the fast bus interfaces are driving toward special signal integrity connector technology that hasn’t been used for solid state storage before.
Small form factor modules and SBCs are hard-pressed for available connector space as it is, so new standards may need to combine other unrelated functionality to earn a seat at the SBC and COM (computer-on-module) table. This will take years to play out. No matter which way flash evolves for consumer and embedded markets, together and differently, flash is here to stay and we will only become more addicted to it.
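For readers curious about what the wear-leveling mentioned above looks like in principle, the toy Python model below may help. It is a deliberately simplified sketch under stated assumptions, not how any real flash controller is implemented: the `WearLevelingMap` class and its methods are hypothetical names, each write is assumed to consume one whole block, and concerns such as bad-block handling, garbage collection and power-loss safety are ignored. The idea it shows is that every rewrite of a logical block lands on the least-worn free physical block, so repeated writes to one “hot” location spread their erases across the device.

```python
class WearLevelingMap:
    """Toy logical-to-physical block map that spreads erases across blocks."""

    def __init__(self, physical_blocks: int) -> None:
        self.erase_counts = [0] * physical_blocks        # erases per physical block
        self.logical_to_physical = {}                    # logical block -> physical block
        self.free_blocks = set(range(physical_blocks))   # blocks holding no live data
        self.data = {}                                   # physical block -> payload

    def write(self, logical_block: int, payload: bytes) -> int:
        """Store payload on the least-worn free block and return that block."""
        if not self.free_blocks:
            raise RuntimeError("no free blocks left")
        # Choose the free block with the fewest erases (lowest index breaks ties).
        target = min(self.free_blocks, key=lambda b: (self.erase_counts[b], b))
        self.free_blocks.remove(target)
        self.data[target] = payload
        old = self.logical_to_physical.get(logical_block)
        self.logical_to_physical[logical_block] = target
        if old is not None:
            # The stale copy is erased and returned to the free pool.
            del self.data[old]
            self.erase_counts[old] += 1
            self.free_blocks.add(old)
        return target

    def read(self, logical_block: int) -> bytes:
        """Return the most recent payload written to a logical block."""
        return self.data[self.logical_to_physical[logical_block]]


# Rewrite the same logical block 100 times on a four-block device.
device = WearLevelingMap(physical_blocks=4)
for i in range(100):
    device.write(0, f"version {i}".encode())
print(device.erase_counts)
```

Without the remapping, all 99 erases would hit a single block; with it, no block is erased more than one time more than any other.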
editor’s report Development Tools Speed Delivery
Accelerating Build and Test Leads to Faster Software Deployment: A Really Big Deal with Android
As software development moves toward the agile approach, the need to build, test and use software under development becomes more critical. Shortening this cycle is essential to meeting market pressures, a need made even more intense by the rapid pace in the world of Android.
by Tom Williams, Editor-in-Chief
Before you can release a software product, you have to test it; you have to verify it through quality assurance, you have to configure it for specific physical environments; you have to package it. And before you can do any of these things, you have to build it, i.e., compile it and assemble its parts into a meaningful whole. Of course, during the development process, the build/test cycle gets run many times, so finding some way to accelerate that process becomes a vital consideration. This is especially true with the ever-shortening time-to-market demands and the increasing complexity of software products.

A large software development operation can consist of hundreds of developers, often geographically dispersed, working on different aspects of a project. This development process, of course, has its own management requirements as programmers are assigned tasks, write and debug code. At some point, the developers check in their respective assignments and what could be termed the software delivery process takes place. That process consists of three major aspects, which are build, test and deliver. All three of these stages involve what is often a very complex workflow. The ability to organize, manage and automate that workflow can go a long way toward shortening this cycle time, especially when that includes making the generated data automatically available to those tasks that need it as well as to developers for use in appropriate reports.

One company that has been addressing this particular need is Electric Cloud, which has developed a suite of
solutions for general software development and has announced a solution specifically for use with the Android operating system. According to Electric Cloud’s Marketing VP, Kalyan Ramanathan, the growing adoption of agile software development methods has made the need to shorten this build/test/deliver cycle even more compelling. Agile development involves teamwork, collaboration and adaptability and is incremental. Increments known as timeboxes can be as short as one to four weeks and involve coding and testing and, at the end of each iteration, the presentation of the results to stakeholders. While that may not be the final release, the progression does involve a form of delivery. Ramanathan says, “The increase in agile software development means you need to deliver the software at the same pace you are developing the software. You’re not just developing code, but also getting it into the hands of a user for feedback.”

The overall workflow management tool is called ElectricCommander, which automates and accelerates software delivery with an “orchestration engine” that brings together production processes and their supporting compute resources. The tool does not dictate the tools or processes themselves but includes them in an automated environment to achieve faster cycle time and more efficient use of resources. For example, in the testing cycle, ElectricCommander can assign groups of tests to run in parallel across an organization’s server farm or, in the case of smaller customers, in a cloud environment such as that provided by Amazon. It also provides a workflow visualization dashboard, which presents a view of the overall process to provide the current job status within each process and the workflow history. In addition, it stores all build, test and deploy information for reporting and statistical analysis. For example, once the suite of tests has been run, success and failure information can be made available to the
developers for appropriate action and the software can proceed to QA testing. Alternatively, it can be deployed for an evaluation iteration and thence back to the development team for corrections and/or enhancements, since the agile process does not simply fill predetermined requirements but also builds on feedback between cycles.

[Figure 1 diagram: “Test (CTS) Optimization: Up to 70% test time acceleration.” Flowchart: gather input; check test execution flags; run Bionic tests; run Dalvik tests; run CTS plan; on failures, run the failed-cases and timed-out-cases derived plans. Chart: total time, 457.00 versus 119.00.]

Figure 1 The Compatibility Test Suite for Android can be dramatically shortened with ElectricCommander’s ability to run large numbers of tests in parallel over distributed compute resources.

Taming the Android Monster

What is true of the general software development process cycle applies in spades to the world of Android. With active Android devices having crossed the one-billion mark with no end in sight, and with the variety of devices, their specialized internal software functionality and the rapid growth of advanced versions of the operating system, the time-to-market pressure on OEMs is getting intense. To address this, Electric Cloud has introduced an integrated software delivery tool especially for Android. Electric Cloud for Android integrates the functions of ElectricCommander, the build acceleration tool called ElectricAccelerator, and the company’s ElectricDeploy tool.

Since Android is, mostly, considered open source, developers do have the ability to modify the operating system. One big issue, of course, is whether a product modified in this way can still carry the coveted little green robot. In order to earn that little symbol, Google requires that a device certifiably pass the Compatibility Test Suite (CTS), a battery of some 17,000 tests. In addition, most OEMs developing unique products will have suites of tests for their own code as well. The ElectricCommander tool can automatically parallelize the execution of tests over the available resources. Depending on available compute resources, Electric Cloud states that test cycle times for the CTS can be reduced by as much as 75 percent (Figure 1). And Ramanathan points out that in the context of time-to-market pressures and the cycle times involved, computers today are a relatively modest investment. He
also notes that some customers tend to see the cloud as fairly attractive for test execution in contrast to build operations, which additionally involve quite a bit more data movement and storage resources than testing operations.

[Figure 2 panels: build timelines by agent, “Without ElectricAccelerator (4 resources)” and “With ElectricAccelerator (4 resources)”, plotted against time in minutes; chart “Add more resources to accelerate builds,” plotting estimated build time against number of agents, with values ranging from 4h28m29.85s down to 11m51.31s (best possible).]

Figure 2 The ElectricAccelerator can detect dependencies among elements of code and compile at an optimal degree of parallelism while retaining the needed dependencies.
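The kind of dependency-aware parallel build that Figure 2 depicts can be sketched in a few lines of Python. This is only an illustration of the general technique (list scheduling over a dependency graph), not Electric Cloud’s implementation; the function name `simulate_build`, the target names and the durations are all hypothetical.

```python
import heapq


def simulate_build(durations, deps, agents):
    """Simulate a dependency-aware parallel build.

    durations: {target: build time in arbitrary units}
    deps:      {target: [targets that must finish first]}
    agents:    number of build agents working in parallel
    Returns (makespan, start_times).
    """
    if agents < 1:
        raise ValueError("need at least one agent")

    # Count unfinished prerequisites per target and index the reverse edges.
    remaining = {t: len(deps.get(t, ())) for t in durations}
    dependents = {t: [] for t in durations}
    for target, prereqs in deps.items():
        for prereq in prereqs:
            dependents[prereq].append(target)

    ready = sorted(t for t, n in remaining.items() if n == 0)
    running = []      # min-heap of (finish_time, target)
    start_times = {}
    now = 0
    while ready or running:
        # Hand every idle agent a target whose prerequisites are all built.
        while ready and len(running) < agents:
            target = ready.pop(0)
            start_times[target] = now
            heapq.heappush(running, (now + durations[target], target))
        # Advance the clock to the next completion and release dependents.
        now, finished = heapq.heappop(running)
        for child in dependents[finished]:
            remaining[child] -= 1
            if remaining[child] == 0:
                ready.append(child)

    if len(start_times) != len(durations):
        raise ValueError("dependency cycle detected")
    return now, start_times


# Hypothetical project: two objects feed a library; the library and a
# third object feed the final executable.
DURATIONS = {"a.o": 3, "b.o": 2, "c.o": 4, "lib.a": 1, "app": 2}
DEPS = {"lib.a": ["a.o", "b.o"], "app": ["lib.a", "c.o"]}

if __name__ == "__main__":
    for agents in (1, 2, 4):
        makespan, _ = simulate_build(DURATIONS, DEPS, agents)
        print(f"{agents} agent(s): {makespan} time units")
```

With this toy graph the build takes 12 units on one agent, 8 on two and 6 on four. Six units is the length of the longest dependency chain, so adding agents beyond that point would not help, which mirrors the “best possible” floor in the chart.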
In addition, the ElectricAccelerator in the Android tool can allocate builds across available resources to dramatically reduce build time. Based on the contents of the makefile, the ElectricAccelerator can understand the dependencies between components and allocate their compilation over the available compute resources. As shown in Figure 2, a single quad-core machine would compile the non-dependent elements as resources become available and would build those elements with dependencies in the proper sequence. In the figure, these are represented by the yellow and green elements. Four such machines could reduce the overall build time to less than half that of the previous example. Adding even more resources, as in the chart on the right, could bring achingly long build times down quite dramatically. A lot depends, of course, on the size and complexity of the project and the amount of resources one is willing to throw at the problem.

ElectricAccelerator includes automatic conflict detection and correction technology to avoid broken builds. It determines which files were used to build every object file, library or executable. Thus when build steps are run out of order, it automatically reruns them in the correct order. It also features visual build analysis and reporting that mines the data to provide a graphical representation of the build structure for performance analysis.

Among other features of the Android tool is the ability to execute multiple build-test-release cycles in parallel across multiple versions of Android, tests and platforms. In addition, in tribute to how rapidly this world of Android is changing and updating, an out-of-the-box Gerrit integration tool enables automatic synchronization with the latest Android code base and patches.

The world seems to be caught between two conflicting trends: the ever-increasing size and complexity of software and the continual demand for shorter development cycles and time-to-market. It would appear that the only remedy for increasing complexity is to increase automation. But that also involves injecting ever more intelligence into the process, which is the true contribution of such tools as these.

Electric Cloud Sunnyvale, CA. (408) 419-4300. [www.electric-cloud.com].
Technology in
context
COM for Rugged Environments
Small, Light and Rugged: Qseven Enables New Battlefield Technologies
Man-Wearable System Achieves 40 Percent Power Reduction Shifting from Custom to Qseven Platform
by Dan Demers, congatec and Michele Kasza, Connect Tech

As military electronics become more mobile and more complex, designs that combine form, function and performance are in demand on today’s battlefield. Unique applications and challenging tactical environments may complicate the design process, but the resulting systems are invaluable to soldiers in the field. Today, man-portable systems go where the action is and must perform flawlessly, managing high volumes of sensor data, sharing information in real time and performing complex computing tasks in extreme tactical environments. These sophisticated, small form factor systems demand an optimal computing platform, ensuring performance, meeting application needs and handling the range of rugged requirements that define military computing.

Computer-on-Modules (COMs) are playing a significant role in enabling these ultra-portable innovations; by leveraging performance advantages of the latest Qseven standard, designers can quickly develop high-performance, lightweight systems that are highly customized to military applications. These small platforms are driving big changes in the way military designs are developed, fueling a shift to standardized technologies that not only deliver performance, but also allow integrators to compete more aggressively for military contracts. Getting these powerful systems out into the field quickly is essential. They provide soldiers with a new level of situational awareness, allowing them to control robotic support from a safe distance, or enabling real-time visual simulation and training long before they reach the front lines.

Qseven Sees Battlefield Action
Systems designed for field mobility must be sturdy, rugged, compact and lightweight. Furthermore, a design focus on power efficiency ensures that these devices are well-suited to battery-powered applications. Small size and low power consumption might once have defined a non-essential device, but today these factors are vital to portable field computing as illustrated by Quantum 3D’s Thermite TL 2000, a ruggedized embedded computer system designed for man-portable applications.
This rugged computer system is a thin and lightweight device that provides processing performance with extended battery life for man-wearable graphics and video-intensive applications (Figure 1). This is particularly important for command, control, communications, computers, intelligence, surveillance and reconnaissance (C4ISR) missions. In addition, field-based training, mission planning, mission rehearsal, weapon system control, maintenance, robotics and other mobile applications can be controlled from this advanced portable device worn by individual soldiers or teams of soldiers, in the field or learning to make tactical decisions in battlefield environments.

Quantum 3D’s Thermite TL 2000 is a next-generation product based on an earlier version of the Thermite TL system. The previous Thermite system was developed as a full custom design; however, the Thermite TL 2000 jumped to the Qseven standard.

“After our initial design of the Thermite TL, customer feedback indicated a strong desire for a smaller form factor system with significant horsepower and capabilities. It had to be sufficiently small, light and power-efficient to be put into a man-worn environment. Thermite’s promise was in delivering rugged computing performance in such a small size that it could be physically worn by soldiers, and still handle demanding applications such as robotics or aircraft control,” said Pratish Shah, vice president of sales and marketing, Quantum 3D. “We took a thorough look at how we could provide the greatest value, focusing on ruggedizing the product and achieving a significant reduction in power consumption, and concluded we could gain a strategic advantage by leveraging third-party capabilities along with existing standards.”

Quantum 3D determined that congatec’s Qseven module would provide the form, function and performance required by customers (Figure 2). The Qseven embedded computer module is a solution for virtually any low-power or ultra-mobile embedded PC application due to its compact size, minimum power consumption and low cost.
Transitioning Designs from Custom to Qseven
Figure 1 Sturdy, rugged, compact, lightweight and power-efficient, Quantum 3D’s Thermite TL 2000 is designed for use in any environment. By combining embedded computing, conduction-cooling technologies and support for open-architecture operating systems, the Thermite TL 2000 brings performance and capability to the battlefield.

Quantum 3D had several key criteria to address in transitioning Thermite TL to Thermite TL 2000. Thermite’s existing silicon option was reaching end of life, and there was an overall desire to reduce size, weight, power and heat dissipation. First on the list was gaining CPU performance as measured by a number of applications put forth by top customers. Specifically, it was essential that the Thermite TL 2000 achieve a certain level of video frame rate and playback. Next up was reducing power consumption by as much as 40 percent. Many end-user applications are battery-powered, and an advantage goes to the device that can make its battery last as long as possible while still achieving a high level of system performance. And perhaps most importantly, these advancements had to be invisible: the form factor was evolving, but the physical case and external connectors had to remain 100 percent identical to the earliest Thermite product generation. With a deployed solution like Thermite, existing customers could only capitalize on the performance advancements of the TL 2000 if their systems could maintain all mechanical form factors that had been specifically developed to the shape, size and weight of the existing device.
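To put the 40 percent target in perspective, the arithmetic below shows what such a reduction would mean for runtime on a fixed battery. It is a back-of-the-envelope sketch only: the 50 Wh battery and the 10 W baseline draw are hypothetical figures chosen for illustration, not Thermite specifications.

```python
def runtime_hours(battery_wh, avg_power_w):
    """Runtime on a battery of the given capacity at a constant average draw."""
    if avg_power_w <= 0:
        raise ValueError("average power must be positive")
    return battery_wh / avg_power_w


def runtime_gain(power_reduction):
    """Runtime multiplier when average power falls by `power_reduction`
    (0.40 means 40 percent lower) and the battery stays the same."""
    if not 0 <= power_reduction < 1:
        raise ValueError("power_reduction must be in [0, 1)")
    return 1.0 / (1.0 - power_reduction)


BATTERY_WH = 50.0    # hypothetical battery capacity
OLD_POWER_W = 10.0   # hypothetical draw of the earlier design
NEW_POWER_W = OLD_POWER_W * (1 - 0.40)   # 40 percent lower

if __name__ == "__main__":
    print(f"before: {runtime_hours(BATTERY_WH, OLD_POWER_W):.2f} h")
    print(f"after:  {runtime_hours(BATTERY_WH, NEW_POWER_W):.2f} h")
    print(f"gain:   x{runtime_gain(0.40):.2f}")
```

Whatever the actual battery and load, the ratio holds: cutting average power by 40 percent stretches runtime by a factor of 1/(1 - 0.40), or about 1.67.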
Carrier Board Expertise Reduces Development Time
For Quantum 3D, Qseven COMs served as modular building blocks to support a much faster design and development process. Qseven COMs are widely available in a well-developed ecosystem, and simplify the customization process for embedded design. The COM’s related carrier board holds the key to customized performance, containing all customization related to the end-user application. For future product planning, upgrades are possible by switching out the Qseven module; while customization must be adapted to any new silicon, it is an integration process rather than costly and time-consuming redevelopment.

Connect Tech had an off-the-shelf carrier board that came very close to offering the feature set and size required for Quantum 3D’s Thermite TL 2000. Connect Tech was able to quickly execute proof of concept using off-the-shelf hardware, and then determine the application-specific needs for the final board. Carrier boards work in conjunction with COMs, providing instant access to current embedded processors while remaining easily upgradable to accommodate future generations. Connect Tech Qseven carrier boards address a variety of feature set requirements and accordingly offer a variety of embedded processor options including Intel Atom, Freescale i.MX51, TI OMAP and NVIDIA Tegra. Carrier boards are intentionally not tied to any specific bus architecture, and offer Mini-PCIe and SIM-card expansion capability and designer’s choice of Mini-PCIe peripherals, including Wi-Fi, GPS, Bluetooth or storage. Customization can be flexible and depends entirely on the goal of the system and the end-user application (Figure 3).

Investments in customization, such as defining the I/O mixture, continue as the life of the product progresses. As Thermite TL 2000 upgrades to next-generation processors, none of the main hardware needs to be changed, and the customization and feature set simply move forward with the performance advancement. Carrier board design requires experience in routing PCI Express and in testing signal integrity and mechanical stability for the range of high-speed interfaces. Further, mounting within the enclosure demands knowledge of thermal management and heatsinking techniques, which are essential to the small and confined physical space of this device.

Once engaged in the custom conversation, Connect Tech was able to provide Quantum 3D with mechanical samples in five weeks; at week eight they provided functional, tested prototypes along with a customized and improved thermal design solution. Common connectors were not rugged enough for Thermite TL 2000’s requirements, so Connect Tech developed a connector-free carrier board. Only the MXM connector, which connects the Qseven COM to the carrier board itself, is present in the design; the board simply uses an edge connector to plug into Quantum 3D’s Thermite system.

Figure 2 In contrast to its COM Express counterpart, Qseven is specifically designed for mobile and battery-operated applications, with a maximum power consumption of 12 watts. Qseven has abandoned all legacy interfaces such as parallel IDE and PCI bus to reduce complexity and cost. In step with its mobile focus, Qseven incorporates only the latest interfaces and is intended to support performance of future mobile chipset/CPU combinations.

Thermite TL 2000’s Standards-Based Future
As silicon architecture evolves to include new processors, the customization of the Thermite TL 2000’s Qseven COM and carrier board combination remains intact from generation to generation. Ruggedized elements, connectors and form factor are all unaffected, and customization can be reused with the modified COM without costly redesign. Qseven COMs are a powerful tool in the military design challenge, meeting rugged performance standards with all the ease of a pre-integrated system.

Created to support small-sized, low-power, mobile and ultra-mobile applications, the Qseven module measures just 70 mm x 70 mm and does not require an expensive board-to-board connector but rather an inexpensive, yet reliable, 230-pin MXM edge connector known from mobile graphics cards. The thermal design power (TDP) called out in the Qseven specification is limited to 12W, but even more significant is the specified supply voltage of 5 volts, allowing a mobile device to run efficiently on two lithium cells. Qseven supports no legacy I/O such as 32-bit PCI or IDE. Instead, it focuses on current I/O such as PCI Express and digital display interfaces. Both the x86 and ARM architectures are supported in the Qseven specification. This allows for a wide range of options and scalability for designers of military-based computing platforms.

As Thermite TL 2000 moves to future product generations, military customers will undoubtedly require continued improvements in processing and performance. “As customers begin to demand even less power usage, we will examine different silicon architectures that are now available on the Qseven standard,” said Shah. “Quantum 3D will continue to explore Intel’s faster performance options as they become available, and also extend into ARM-based and other CPU architectures that may enable new capabilities and features that were not previously possible.”
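The 12W ceiling and 5-volt supply mentioned above translate into a simple current budget, sketched below. The figures for the battery side are assumptions made for illustration: two lithium-ion cells wired in series at a nominal 3.7 V each, and a step-down converter at 90 percent efficiency. Neither value comes from the Qseven specification or from the Thermite design.

```python
def rail_current_a(power_w, rail_v):
    """Current drawn from a supply rail at a given power."""
    return power_w / rail_v


def pack_current_a(power_w, pack_v, efficiency):
    """Current drawn from a battery pack feeding the rail through a
    converter with the given efficiency (0 < efficiency <= 1)."""
    if not 0 < efficiency <= 1:
        raise ValueError("efficiency must be in (0, 1]")
    return power_w / (pack_v * efficiency)


MODULE_POWER_W = 12.0      # Qseven maximum cited in the article
RAIL_V = 5.0               # Qseven supply voltage cited in the article
CELL_NOMINAL_V = 3.7       # assumed nominal Li-ion cell voltage
PACK_V = 2 * CELL_NOMINAL_V   # assumed: two cells in series
CONVERTER_EFF = 0.90       # assumed step-down converter efficiency

if __name__ == "__main__":
    print(f"5 V rail current at 12 W: {rail_current_a(MODULE_POWER_W, RAIL_V):.2f} A")
    print(f"pack current at 12 W:     "
          f"{pack_current_a(MODULE_POWER_W, PACK_V, CONVERTER_EFF):.2f} A")
```

Under these assumptions a module running at the full 12W draws 2.4 A from the 5 V rail and roughly 1.8 A from the two-cell pack; typical operation below the ceiling would draw proportionally less.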
Partnerships Help Solve Design Challenges
By making the leap to Qseven, Quantum 3D was able to focus primarily on its core competency of ruggedizing the Thermite TL 2000’s design. Low power and high performance were ensured using an energy-efficient Intel Atom E680 processor; the device delivers up to 2 Gbytes of system memory in a compact, rugged enclosure. Power-efficient design includes dynamic power control to manage power consumption and thermal dissipation. Thermite TL 2000’s modular architecture is flexible, enabling military integrators to tailor systems for specific CPU performance, video processing and 2D/3D graphics, networking, I/O and data storage. I/O is versatile with six USB ports, RS-232C, audio and video options, and 10/100 Ethernet ports for high-speed signal processing. Depending on the end-user application, Thermite TL 2000 can include up to 60 Gbytes of ruggedized solid state storage.

Quantum 3D designed Thermite TL 2000 for rugged performance from the ground up: it is conduction-cooled with no moving parts and validated to MIL-STD-810G to ensure resistance to extreme temperature, vibration, shock and immersion, and to MIL-STD-461E to handle radiated emissions. MIL-SPEC chassis and connectors create the proper construction for rugged, reliable performance.

In addition to keeping Quantum 3D’s focus on rugged design elements, the standards-based Qseven platform allowed the company to deliver faster than competitors. “Qseven allowed us to focus on Thermite TL 2000’s ruggedization, meeting the extreme specifications of our military customers,” said Shah. “We concentrated our resources on validating performance in extreme temperature ranges, under the effects of immersion and high altitude, and in conditions of shock and vibration. Because we were able to move more quickly through this development process, we were also able to tap into a few additional military programs and opportunities. Qseven enabled us to not only deliver a rugged, high-performance product, but also to be more responsive and competitive in military design initiatives.”
Figure 3 Connect Tech’s PCIe/104 Qseven Carrier Board is a small embedded carrier board that allows complete integration with any industry standard Qseven module. This carrier board utilizes the PC/104 form factor with 4 x1 PCIe lanes and the PCIe/104 bus. The onboard connectors enable connection to SATA, USB, Ethernet, LVDS Video, VGA Video and RS-232 & RS-422/485.
Qseven Proves its Mettle in Military Design
Small size, rugged high performance and low power consumption are hallmarks of portable military design. Technology here follows a very similar path to consumer electronics in that devices are shrinking and becoming more and more sophisticated. Military applications take that performance to the extreme with devices that demand longer battery life and the ability to handle huge amounts of application data, and that must also cope with extreme heat, high altitude, corrosive sea water, shock and vibration, or all of the above. Qseven COMs help designers address these challenges, while speeding time-to-market, allowing developers to focus on core competencies and providing a cost-effective upgrade path for long military deployments.

This is important for designers, as growth in ultra-portable military systems is a trend likely to continue. A recent study from market research firm ASD Reports estimated a $2.77 billion global market for man-portable military electronics in 2013. Even with looming cuts in defense spending, long-term troop deployments have essentially served to validate the effectiveness and necessity of these types of devices.

Designers developing military systems face some of the most significant demands in the realm of embedded design. Whether the system is designed for ground vehicles, aircraft, shipboard computing, or the broad range of man-portable applications, COMs act as modular building blocks for rugged electronics. Offering a high-performance, small form factor platform, with the longevity provided by a flexible processor upgrade path and carrier board customization that can endure through future product generations, Qseven COMs are an ideal option for rugged military designs.

congatec San Diego, CA. (858) 457-2600. [www.congatec.com].
Connect Tech Guelph, Ontario, Canada. (519) 836-1291. [www.connecttech.com].
Quantum 3D San Jose, CA. (408) 600-2500. [www.quantum3d.com].
Technology in
context
COM for Rugged Environments
Rugged, Reliable and Real: COMs Broaden Their Reach
With the reinstitution of VITA 59, Computer-On-Modules (COMs) are finally making their mark, especially in rugged environments where their small size, low power and high computing strength are increasingly in demand.
by Barbara Schmitz, MEN Mikro Elektronik
When COMs were introduced in the early 2000s, a true “standard” COM never materialized, as each manufacturer fragmented the parameters, essentially negating the very principles on which COMs were built. The concept never fully caught on. Then, there was the durability factor. While COMs proved to be cost-effective and flexible, their construction didn’t hold up to the growing number of rugged applications that were creeping into embedded computing platforms. The concept was limited in its implementation.

Despite its initial setbacks, the COM concept is based on some very solid, very beneficial principles. Simply put, COMs are complete computers on a plug-on module, with the functionality individually tailored to each application by configuring the I/O on a separate carrier board. Since only the carrier board for the application has to be developed, system costs can be cut significantly, while optimizing time-to-market. Recent developments in the concept have been aimed at combatting the initial stumbling blocks to promote wider use of this cost-effective, modular technology.
Figure 1 Areas such as Precision Farming are outcroppings of existing markets, fueling more widespread use of the rugged COM concept.
Figure 2 VITA 59 brings the cost-effectiveness of COMs to safety-critical environments via ruggedization of the module.
First, the implementation of a standardization process through the VITA organization is setting the design principles in stone, making the cost benefits of reduced design efforts a reality. Second, the inclusion of ruggedized design principles into the modules themselves has opened up a new world of applications.
Success Is Seen in Application Growth
Termed ESMexpress in its early days before moving into the official standardization process, the concept is now referred to as VITA 59 Ruggedized System-On-Module Express (RSE), and is quickly gaining a foothold in many newer applications that are primed for rugged, compact and reliable computing components. One of the largest indicators of VITA 59’s pending success is the increasing number of uses for these cost-effective modules. Technically, it’s not the actual markets that are new, necessarily; it’s how this concept is able to be implemented that is fueling its growth (Table 1).

Take crop farming, for example. While commercially growing produce has been around for decades, the rise of intelligent farming is a newer trend within this particular industry. Machinery historically thought to just “turn over the dirt” is increasingly being developed for “Precision Farming” applications. A touch display computer can now easily control farming machines like combine harvesters. Mobile telemetry enables GPS-controlled fleet management to optimize driving on large fields and coordinate escort vehicles. The control system can manage data acquisition to accurately gather, among other things, positional data, quality of harvest and actual humidity conditions, as well as coordinate data storage to make the gathered data usable for tasks such as optimized fertilizer distribution (Figure 1).

The electronics inside these types of panel PC control units can implement a standard VITA 59 module on a customized carrier board. And depending on the performance requirements of different farming vehicles, the design flexibility of the rugged COMs allows the use of either an Intel Atom-based module or a Core 2 processor, for instance.
TABLE 1 Typical Harsh Environments for ESMexpress
Controls for trains, airplanes, ships: Servers; Switches
HMIs in commercial vehicles and construction machines: Buses; Trucks; Concrete mixers; Cranes; Tractors; Oil rigs
Industrial control systems: DIN-rail based machine control; Robot control; Nuclear power plant; Clean room lasers
Mobile test systems: Cars; Commercial vehicles
Outdoor security systems: Video surveillance systems for homeland security; Wireless/fixed access control
Public transportation infotainment systems: Trains; Underground; Buses; Stationary infotainment; Passenger terminals
Traffic control & citizen systems: Red-light control; Speed control; Police radios; Emergency phone networks
Mobile medical equipment: Imaging systems (CT, ultrasound); Patient monitoring; Treatment and anesthesia; Laboratory engineering
Easier replacement of one module with another of compatible form, fit and function has enhanced design flexibility in compact packaging, extended application versatility, reduced development costs and optimized time-to-market. This is a concept that holds benefits for a wide number of applications. This rugged, standardized COM concept is easily transferable to more traditional industrial control and monitoring applications, including high-volume, precision operations, like motor vehicle production, or safety-critical situations, such as a nuclear power plant.

Ruggedness is one factor of reliability in the computing world, and the design principles of VITA 59 have enabled these modules to withstand exceptional environmental factors, while adhering to common industry parameters. Computer equipment used for railway and airplane travel, for example, must be manufactured according to relevant standards, which include certain levels of ruggedization. Safety requirements up to Safety Integrity Level (SIL) 4 become involved if the electronics are used in critical operational functions such as automated train controls, brake control or door closing mechanisms on trains, buses, subways and streetcars, or in air traffic control (Figure 2).

With ruggedized COMs able to withstand the harsh conditions present in these applications, embedded designers can now take advantage of the cost savings and modularity of the VITA 59 modules in a wide range of applications. The initial benefits of COMs are now reaching farther and farther into embedded computing.

Medical engineering is another area where safety-critical systems are paramount. Portable imaging systems, such as CT and ultrasound machines, are becoming more popular, as are remote monitoring systems that can monitor, anesthetize or treat patients. In the field, medical operations are also involving rugged COMs, as equipment needs to be lightweight, compact and reliable, while being easy to move from an accident site to an emergency vehicle to a hospital room without missing a beat.
Why it Works
Figure 3 Diagram of the cooling structure used with VITA 59. (Diagram labels: top cover, GapPad (6x), ESMexpress PCB, frame, screws DIN 965 - M2x14 (8x).)
The new rugged COM design standardizes module size and pin connections while enabling CPU choice according to application needs, such as graphics-oriented or computation-intensive requirements. The easily customizable modules plug onto carrier cards to provide unique I/O connections and other functionalities required for specific applications. But the most significant point of differentiation in this new standard is that specifications for the plug-on module address attributes necessary to deliver mechanical stability, rugged physical performance and aggressive fanless cooling for heat dissipation, even in harsh operating environments. VITA 59 modules are built on a 105 mm x 135 mm form factor, leaving a full 95 mm x 125 mm of usable board space for electronic components. That form factor is compatible with COM Express carrier boards and allows users to employ existing modules in either format, within the appropriate application environments. The essence of the new VITA 59 standard is in the design specifications. These rugged COMs can function in more extreme application environments not traditionally supported by COM Express or the majority of the other COM concepts. By adding a surrounding PCB area for conduction-cooling and mounting space as well as a rugged, plug-compatible COM Express connector to the existing COM Express concept, VITA 59 provides an easy method of converting
COM Express to a ruggedized version with minimal effort. The VITA 59 design is simply a rugged mechanical extension of COM Express boards. An aluminum housing covering the module’s PCB provides EMC protection conforming to EN 55022. And for particularly harsh operating environments, where dust or moisture is a potential concern, conformal coating is optional for added protection. Fully soldered connections provide shock-resistant and vibration-resistant performance rated to withstand 15 g/11 ms for shock and 1 g/10 Hz…150 Hz for sinusoidal vibration. The design includes many other mechanical and thermal features to protect the plug-on module, while providing conduction- and convection-cooling in fanless environments. Although VITA 59 sets a threshold of 35W for power dissipation, many of the specific CPUs used in current rugged COM modules actually draw only a fraction of that power. These low-power processors improve compatibility for mobile applications and minimize cooling concerns. The aluminum housing that protects the PCB combines with a metal frame and mechanical connections that contribute to the thermal performance of the overall design. Both the PCB itself and the metal
frame help to draw heat away from the processor and transfer it to the cover. The hottest components within the unit are also coupled directly to the cover. Eight mounting screws secure the entire physical assembly in place and ensure a positive connection for optimum heat transfer. The aluminum housing can be connected to an external heat transfer device (conduction) or combined with a heat sink for heat dissipation (convection) should additional cooling of a module be required (Figure 3).
A Wealth of Modern Interfaces
Aside from the construction, the rugged COMs account for today’s computing requirements as well. Modern serial buses, without switched fabrics, provide high reliability for harsh-environment and mission-critical applications and are compatible with a variety of standard operating systems. The extensive range of communications protocols supported by the standard as well as chipsets and available carrier boards help enable the use of rugged COMs in a wide variety of applications. COM Express pin assignments guarantee interoperability between any VITA 59 module and various COM Express and VITA 59 carrier boards. Embedded designers have a host of interface options at their disposal:
• Up to eight USB 2.0 host ports (or seven host ports and one client port, adjustable by software) provide data rates up to 480 Mbits/s, offering flexibility for a wide range of end-use functions.
• Up to three Serial ATA (SATA) communications ports, available through the VITA 59 connector via a PATA-to-SATA converter, support RAID functions and provide data transfer rates up to 100 Mbytes/s.
• For PCI Express there are four single-lane ports (4 x1) and one port that can be configured as 1 x16, 1 x8, 2 x4 or 2 x1.
• Ethernet options provide up to three 1 Gbit ports of 1000Base-T (also 10 Gbit).
• Full-featured SDVO and LVDS ports support graphics-oriented applications.
• High-definition audio is available via the VITA 59 connector.
• Extendable I/O, controlled through FPGAs, provides ample flexibility to configure the specific I/O requirements. With the FPGA functionality provided on the carrier board via a PCI Express link instead of on the COM module itself, fewer pins need to be reserved for FPGA functions than if the FPGA resided on the COM module.
• DisplayPort and HDMI support allows for a second GMBUS defined on PCIe x16 split signals.
A low-voltage I/O mode (3.3V or 1.5V I/O) is now available with the I/O reference voltage indicated by the CPU board on the J1-43 pin. Serial RapidIO is supported on all pins formerly dedicated to PCI Express. Additional work is currently being done by the standards committee to define mechanics compliant with VITA 30.1 that would allow for the use of modules with CompactPCI Serial and VPX by substituting the module frame with an integrated frame on the carrier board.
The VITA 59 RSE standard opens up a wealth of technological capabilities for mission-critical and mobile applications previously not addressed by COM designs. As embedded system applications move farther afield and demand longer life in mobile and harsh applications, ruggedization and reliability will become key concerns for the design engineer. With affordability a close third on the list of design concerns, the ability to segregate the more complex, critical CPU functions onto modules with standard pin assignments, regardless of their function, is also imperative. Application-specific functions can be resolved on easily adapted carrier boards, bringing cost-effective functionality together with targeted application development. The new design of the COM concept puts computing flexibility and rugged, reliable design in the hands of embedded engineers, while keeping development costs low and enabling smart, cost-effective upgrades.
MEN Micro Ambler, PA. (215) 542-9575. [www.menmicro.com].
Technology Connected
Strategies for Fabrics
PCI Express Fabrics Break Through I/O Limitations
Integration of PCIe 3.0 with VPX platform improves performance tenfold with no porting efforts.
by Vincent Chuffart, Kontron
Matching CPU and I/O performance is an ongoing challenge for evolving industrial embedded designs. High-performance data processing applications require greater and greater processing capabilities that must not be bottlenecked by limited I/O performance. Sensor processing applications such as video compression, medical imaging, wireless communications, and test & measurement are examples of fast-growing areas that require a next-generation design solution to match I/O bandwidth to processing speed. VPX is gaining ground in these industrial scenarios, ushering in a new era of rugged embedded computing and allowing board computers to move away from decades of parallel bus architectures that seriously limited I/O performance. VPX connectors and backplanes can carry multi-gigahertz signals and enable systems where the bandwidth is no longer shared between boards. Combined with Generation 3 Peripheral Component Interconnect Express (PCIe 3.0), the result is a new breed of unparalleled applications for high-performance data processing platforms.
Redefining Embedded Performance
Initially designed to replace the older PCI and PCI-X, PCIe is a computer expansion card standard used throughout the computing and embedded devices
Figure 1 The Kontron VX3042 and VX3044 Intel Core i7-based SBCs leverage PLX’s PCIe Gen3 switching technology along with Kontron’s exclusive VXFabric software. In addition to having two 10 Gigabit Ethernet channels already featured on the boards, VXFabric implements TCP/IP over PCI Express as a second data plane for higher-performance embedded computing. This combination of features enables efficient system convergence, as all devices and subsystems offer native PCIe, which permits immediate use of an existing infrastructure, thereby lowering latency, cost and power. Kontron VXFabric provides the software between the PLX ExpressLane switch and the bottom of a standard TCP/IP stack, which allows the boards to use their existing TCP/IP-based application without having to be modified.
industries. PCIe provides a high-speed, high-performance, point-to-point link for interconnecting devices, a design essential for data-hungry industrial applications and an alternative to the system-wide shared parallel bus architecture. Now in Gen3, PCIe technology has roughly doubled the throughput per lane from Gen2’s 4 Gbits/s to nearly 8 Gbits/s. Although the raw signaling rate has only increased from 5 GT/s to 8 GT/s, the move from 8b/10b to the more efficient 128b/130b encoding means the usable bandwidth is effectively doubled. In addition, PCIe 3.0 maintains backward compatibility with previous generations. Developers of PC interconnects, graphics adapters, chip-level communications and others gain even greater performance capabilities, matching CPU and I/O performance ratios. Consider all the data flowing within an M2M application on a diverse range of devices; PCIe 3.0 receives and transmits data on separate sets of signals, relieving data bottlenecks created by attempts to process an increasing amount of data in real time. Industrial applications such as medical, infotainment, digital signage and M2M/HMI are poised to benefit greatly from the combination of the VPX platform and PCIe 3.0, due to performance advances made possible by the resulting tremendous burst of I/O bandwidth and processing power.
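The per-lane arithmetic behind that doubling can be checked in a few lines. The following Python sketch is illustrative only; the function name is an assumption made for this example, and the signaling rates and encoding ratios are the published PCIe Gen2 and Gen3 figures.

```python
def lane_throughput_gbits(rate_gt_s, payload_bits, total_bits):
    """Usable Gbit/s per lane, per direction, after line-encoding overhead."""
    return rate_gt_s * payload_bits / total_bits

# Gen2: 5 GT/s with 8b/10b encoding (8 payload bits in every 10 on the wire).
gen2 = lane_throughput_gbits(5.0, 8, 10)

# Gen3: 8 GT/s with 128b/130b encoding (128 payload bits in every 130).
gen3 = lane_throughput_gbits(8.0, 128, 130)

print(f"Gen2 lane: {gen2:.2f} Gbit/s")
print(f"Gen3 lane: {gen3:.2f} Gbit/s")
print(f"Gen3/Gen2: {gen3 / gen2:.2f}x")
```

The signaling rate rises by a factor of only 1.6, but the encoding overhead falls from 20 percent to about 1.5 percent, which is where the rest of the gain comes from.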
Furthermore, thanks to broad IT market adoption of the technology, PCIe 3.0 is emerging as a leading serial link technology for VPX. High Performance Embedded Computing (HPEC), a typically mil/aero arena that is the first domain to benefit from this architecture, is realizing a tenfold increase in I/O bandwidth between computing boards offered by VPX backplanes. This holds significance for industrial embedded designers, and addresses an I/O performance gap so dramatic that only a fraction of current HPEC market applications really see significant application benefits in terms of accuracy, signal to noise improvement, cost and compactness of embedded computers. PCIe 3.0 allows scalable, simultaneous, bi-directional transfers using 1 to 32 lanes of differential-pair interconnects. By grouping lanes, this standard can achieve high transfer rates similar to graphics adapters. Up to 32 Gbyte/s of bi-directional bandwidth on a x16 connector can be obtained with PCIe 3.0. It also enables low-overhead, low-latency data transfers. With both host-directed and peer-to-peer transfers, emulation of network environments can send data between two points without host-chip routing. These features make PCIe 3.0 an ideal solution not just to link high-bandwidth I/Os to a processor unit, but also to become a native communication link between computing devices in a multiprocessor environment, such as reconstructing 3D images from high-definition image sensors. PCIe 3.0 does, however, require a specific skill set in order to offer the complete temperature range required in industrial applications. But, once mastered, the same bandwidth can be offered with fewer PCIe lanes, in turn accommodating small backplanes (3U) for smaller computers with the same performance. For example, x16 first-generation PCIe offers the same bandwidth as x4 PCIe 3.0, but requires four times the lanes on a backplane.
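The lane-count trade-off can be worked through numerically. The sketch below is a rough illustration, not vendor data: the table and function names are assumptions for this example, and the per-generation rates and encodings are the standard published values.

```python
# Per-generation (signaling rate in GT/s, payload bits, total bits on the wire).
GENERATIONS = {
    1: (2.5, 8, 10),
    2: (5.0, 8, 10),
    3: (8.0, 128, 130),
}

def link_gbytes_per_s(gen, lanes, bidirectional=False):
    """Usable Gbyte/s for a link of the given generation and width."""
    rate, payload, total = GENERATIONS[gen]
    per_direction = rate * payload / total * lanes / 8  # bits to bytes
    return per_direction * 2 if bidirectional else per_direction

gen1_x16 = link_gbytes_per_s(1, 16)
gen3_x4 = link_gbytes_per_s(3, 4)
gen3_x16_both_ways = link_gbytes_per_s(3, 16, bidirectional=True)

print(f"Gen1 x16: {gen1_x16:.2f} Gbyte/s per direction")
print(f"Gen3 x4:  {gen3_x4:.2f} Gbyte/s per direction")
print(f"Gen3 x16: {gen3_x16_both_ways:.1f} Gbyte/s both directions combined")
```

Under these assumptions, a Gen3 x4 link lands within about two percent of a Gen1 x16 link while using a quarter of the backplane lanes.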
Protecting Embedded Software Investments
An existing distributed application might exchange information on gigabit Ethernet, implemented on most VME or CompactPCI platforms today. Alternatively, VPX platforms can implement
TCP/IP protocol over the PCIe 3.0 infrastructure. The VPX system would incorporate VXFabric, Kontron’s open infrastructure that implements efficient interboard communication at hardware speed, to tap into high-speed PCIe 3.0 bandwidth for data transfers simply by selecting a different IP address to connect to the other boards. No change is needed in software coding. This combination of technologies insulates applications from the complex, low-level details of the current generation of PCIe silicon management, and further protects software from obsolescence. VXFabric enables the use of standard communication protocols such as TCP/IP or UDP/IP based on its socket API. The API provides a thin layer of software that allows faster application development for IP-based transport over PCIe 3.0. From a hardware point of view, the architecture is based on several CPU boards, each featuring several processing cores, interconnected through PCIe via the VPX backplane, and using a PCIe 3.0 switch. Through VXFabric, any industrial application based on TCP/IP will run unmodified on this platform. PCIe 3.0 switches offer the ability to combine different data types in a single converged pathway. Data (compute, communication, or storage) is created and consumed as PCIe on each of the slots in the rack, delivering efficiency both in hardware architecture and software usage. For example, built around PLX ExpressLane PCIe 3.0 switches, Kontron’s VX3042 and VX3044 Intel Core i7-based single board computers (SBCs) routinely achieve 5.6 Gbytes/s in data throughput between any boards in a VPX rack (Figure 1). From a software perspective, VXFabric offers the equivalent of an Ethernet network infrastructure, including the IP socket programmatic interface, implementing layers that allow direct access of classic protocols such as TCP or UDP. Development and migration efforts are streamlined since the API requires no modification of existing applications.
The end-user application does not even know it is using VXFabric or PCIe, but instead sees its usual TCP/IP sockets, just like a common Internet- or cloud-based application. This reduces development efforts and simplifies migration to VPX for industrial applications evolving to greater image or video processing performance, sensor data processing, or more rugged deployments such as outdoor digital signage or M2M implementations (Figure 2). With the plug and play capability of the fast PCIe link, the switch fabric moves data at ultra-high speed. This solution enables deployment of high-performance solutions (up to 6U OpenVPX) with better than 10 Gbit/s board-to-board connectivity. At the same time it facilitates integration of next-generation processor architectures. Further, PCIe’s performance as a native data bus in all modern processor chipsets delivers a key advantage: a broad PCIe-based software ecosystem with well-developed support for peripheral interconnects.
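To make the "different IP address, same code" idea concrete, here is a minimal Python sketch using only standard TCP sockets. It does not use any VXFabric API. The two peer addresses are hypothetical placeholders for a Gigabit Ethernet interface and an IP-over-PCIe interface on the same peer board, and a loopback echo server stands in for that board so the example can run anywhere.

```python
import socket
import threading

# Hypothetical addresses for one peer board, one per data plane (assumptions).
PEER_ADDRESSES = {
    "gbe": "192.168.1.12",     # assumed Gigabit Ethernet interface
    "pcie": "192.168.100.12",  # assumed IP-over-PCIe interface
}

def send_block(host, port, payload):
    """Send payload over an ordinary TCP socket and return the echoed reply."""
    with socket.create_connection((host, port), timeout=5) as conn:
        conn.sendall(payload)
        conn.shutdown(socket.SHUT_WR)  # signal end of data to the peer
        chunks = []
        while True:
            data = conn.recv(4096)
            if not data:
                break
            chunks.append(data)
    return b"".join(chunks)

def start_echo_server(host="127.0.0.1"):
    """Loopback stand-in for the peer board; returns the listening port."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind((host, 0))  # let the OS pick a free port
    server.listen(1)

    def serve():
        conn, _ = server.accept()
        with conn:
            while True:
                data = conn.recv(4096)
                if not data:
                    break
                conn.sendall(data)
        server.close()

    threading.Thread(target=serve, daemon=True).start()
    return server.getsockname()[1]

port = start_echo_server()
reply = send_block("127.0.0.1", port, b"sensor frame")
print(reply)
```

On a real system the application logic in `send_block` would stay exactly as written; only the host argument would change, for example from `PEER_ADDRESSES["gbe"]` to `PEER_ADDRESSES["pcie"]`.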
The Value of Switch Fabric
PCIe 3.0 in VPX systems relies on point-to-point connections between boards to manage high-bandwidth traffic. These connections require backplane routing specific to each application, in order to create connections between boards matching the target application data flows. The use of a switch-based fabric approach allows designers to seamlessly implement all routes to and from boards dynamically, in turn offering ample bandwidth to any application’s data flow, all with the same hardware. Using this design approach, OEMs and their customers optimize total cost of ownership and further maintain a direct migration path forward from existing applications deployed today. Switch fabric enables a cost-effective bridge between current Gigabit Ethernet on the backplane (in industrial platforms such as VME, cPCI, VPX) and the next data plane generation of 10G and 40G Ethernet. Industrial applications seeking a performance jump can today access 10G and 40G performance in compact VPX-based systems featuring low power consumption and harsh environment capabilities; these same systems address all fast and low latency peer-to-peer inter-computer node communication within a chassis. This is a sea change for industrial computing, taking next-generation applications well beyond Gigabit Ethernet capabilities.
Figure 2 Kontron’s VXFabric allows existing applications written for TCP/IP sockets to use PCI Express for higher bandwidth communication. No code change is required. To the system, the VXFabric code behaves as an Ethernet device. Physically, on each Kontron board there is a PLX non-transparent bridge chip. A close collaboration between PLX and Kontron was necessary to use the PLX silicon features to their maximum. Each Kontron VPX board implements this part and can offer top TCP/IP performance on the backplane. (Diagram labels: Application, User, Socket, Kernel, TCP/IP stack, VXFabric, Ethernet driver, PCIe hardware, Ethernet hardware.)
The I/O performance advantages from VPX and PCIe 3.0 equate to exchanging the data of one DVD per second between boards. This promises to enable significant advances in size and resolution of images, for example, leading to improvements in imaging applications in manufacturing or medical environments. The rugged small size of these systems also means these features can be more readily used in mobile environments. During the last 15 years, only Gigabit Ethernet evolved to enhance intra-communication bandwidth on backplanes. The 70 Mbyte/s exchanges on VME backplanes were replaced by 120 Mbyte/s exchanges over Gigabit Ethernet, either on the backplane as defined by the VITA 31 standard or with cables on the front. Alternative technologies such as InfiniBand or RapidIO were made available to offer more bandwidth. However, these more niche market technologies will not survive as long as mainstream technologies such as TCP/IP and PCIe. This is due partly to their development cost as well as their inability to offer Ethernet’s guarantee of a long performance lifetime. In turn, board computing systems had to
rely on several different hardware and software solutions for high-speed serial link point-to-point connections between boards. Due to the lack of market traction, none of them is expected to survive the next decade. In contrast, PCIe is available in all chipsets such as Intel CPUs and bridges. There is no extra cost in terms of budget or power to use this efficient link versus other technologies that require extra silicon. Native PCIe is also here to stay, visible in all modern computer architectures from first-generation PCIe in tablets or smartphones, to PCIe 3.0’s wide data path applications in large servers. Convergence is happening, and the same PCIe lanes going out of the board can be used to attach high-speed endpoints such as GPUs or FPGAs, which are already based on PCIe interfaces.
PCIe 3.0 and VPX Meet Long-Term Embedded Design Requirements
As VPX hardware guidelines are set for up to the next 20 years, OEMs and developers must select ideal communications protocols based on proven VPX standards such as VITA 46 and OpenVPX VITA 64. PCI Express, Gigabit Ethernet, Serial RapidIO and many others can be used for intra-system communications, yet the challenge for OEMs is to choose an easy-to-use, yet fast and low latency communication protocol. PCIe 3.0 has emerged as a highly sensible backplane interconnect solution, enabling a new realm of image-intensive, small form factor applications. Multi-gigahertz signals warrant systems where the full data plane bandwidth is no longer shared between boards. With VPX connectors and backplanes, each board is capable of one or more dedicated 10 Gigabit connections via Ethernet or PCIe. The rugged VPX platform enables PCIe 3.0’s high-speed connections in harsh environments, and VXFabric bridges the gap between this disruptive technology and currently deployed applications exchanging data on Gigabit Ethernet. Applications have top performance in a small envelope and avoid proprietary technologies. TCP/IP applications run unmodified on proven, ruggedized VPX platforms, protecting software investments over long-term deployments.
Industrial designers are taking their cue from HPEC applications, with developers focusing on the end application and relying on proven VPX and PCIe 3.0 to create a tenfold increase in I/O bandwidth in a small 3U VPX platform. High-definition medical imaging, video and display interfaces in digital signage, or complex sensor processing from M2M deployments all stand to benefit as well, with faster development, increased mobility and improved overall rugged performance.
Kontron Poway, CA. (888) 294-4558. [www.kontron.com].
PLX Technologies Sunnyvale, CA. (408) 774-9060. [www.plxtech.com].
Technology in Systems
Data Acquisition with Small Modules
MicroTCA / AMC Solutions for Real-Time Data Acquisition
MicroTCA has evolved out of the world of ATCA to become its own backplane system definition with a wealth of modules and high-speed interconnects. This makes it possible to develop a system for high-speed data acquisition that can target specific application needs.
by Rodger Hosking, Pentek, Inc.
Originally designed for high-availability, cost-effective telecom systems, the ATCA architecture using AdvancedMC (AMC) modules has led to MicroTCA, an evolving standard now capturing design wins for an increasing share of embedded real-time applications. Based on a well-defined gigabit serial backplane and switch topology, the MicroTCA platform provides a fast, flexible infrastructure for plug-in AMC modules, now available with functions well beyond traditional telecom. High-speed A/Ds, D/As and the latest FPGAs on new AMC modules take advantage of fast PCIe links to deliver data rates required for demanding real-time data acquisition systems while saving cost over alternative architectures.
Figure 1 AMC data acquisition module consisting of an AMC XMC carrier and XMC module.
Real-Time Data Acquisition: Critical Needs for New Markets
In the purest sense, real-time data acquisition simply means acquiring data at a specified rate with a guarantee of no data loss. However, diverse markets are driving those rates to exponentially higher levels. In most cases, data acquisition includes digitization of sensor signals followed by transmission, buffering, storing and processing. Communication systems for commercial and military applications struggle to meet user demands for more information including high-definition imaging, video and audio programming, streaming internet content, email traffic, large data files and cloud storage for databases. Radar systems with new wideband waveforms not only detect speed, range and direction of travel, but also capture complex information for target classification and identification. Sonar, medical imaging and security scanner systems are migrating to higher resolution and higher frame rates. All of these factors boost the required bandwidth for each signal channel. But because many of the sensors now have multiple elements, the number of channels is growing as well. As a result, demands on data acquisition hardware and system infrastructure have outstripped older open-architecture embedded systems due to data transfer bottlenecks into and across the backplane. Not only must sensor signals be digitized, but the data must also be
Figure 2 Typical 12-slot MicroTCA system architecture. (Diagram labels: MicroTCA Carrier Hub (MCH) with fabric switch, GbE switch and controller; Power Modules; Cooling Units; AMC #1 through AMC #12; MicroTCA backplane interconnect carrying fabric, Ethernet, management and power.)
Figure 3 Typical 12-slot MicroTCA shelf.
delivered to useful system destinations such as shared memory, communication links or storage disks. As an example, a four-channel 200 MHz 16-bit A/D converter is a relatively popular configuration for embedded data acquisition in many of the applications above. Operating at the full speed, one such module generates data samples at a rate of 1600 Mbytes/s. Parallel bus architectures like VMEbus or CompactPCI, with peak data transfer capacities across the backplane between 160 and 800 Mbytes/s, are completely overwhelmed by just a single module. This shortfall spurred the development of gigabit serial backplanes and motherboards including standards such as PCIe, VPX, CompactPCI Serial, ATCA, MicroTCA and others.
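The 1600 Mbytes/s figure follows directly from the converter parameters. The short Python sketch below reproduces it and compares it against the parallel-bus peak rates quoted above; the function and variable names are illustrative assumptions, not part of any vendor tool.

```python
def adc_rate_mbytes_per_s(channels, sample_rate_mhz, bits_per_sample):
    """Aggregate output rate of a multichannel A/D module in Mbytes/s."""
    return channels * sample_rate_mhz * bits_per_sample / 8

# Four channels, 200 MHz sampling, 16 bits (2 bytes) per sample.
module_rate = adc_rate_mbytes_per_s(4, 200, 16)

# Peak backplane capacities quoted for VMEbus and CompactPCI, in Mbytes/s.
PARALLEL_BUS_PEAKS = {"low end": 160, "high end": 800}

# How many times the module's output exceeds each bus's peak capacity.
oversubscription = {
    name: module_rate / peak for name, peak in PARALLEL_BUS_PEAKS.items()
}

print(f"Module output: {module_rate:.0f} Mbytes/s")
for name, factor in oversubscription.items():
    print(f"Parallel bus ({name}): oversubscribed {factor:.0f}x")
```

Even against the best-case parallel bus, a single module produces twice the data the backplane can carry, before any other traffic is considered.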
Highlights of MicroTCA and AMC
In 2005, ATCA vendors announced a replacement for the older parallel bus PMC modules found on the first ATCA boards. These new mezzanine modules (AMCs) are daughter cards that add various analog and digital I/O functions to ATCA system boards, including A/D and D/A converters, DSPs, FPGAs, CPUs, network interfaces for copper and optical links, graphics engines and storage interfaces. Offered in six different sizes, they support several different chassis and carrier board shapes. Figure 1 shows the Pentek Model 56660 4-Channel 200 MHz A/D AMC, a single-width, full-height module based on the AMC.1 Specification, which defines PCIe as the backplane fabric interface. It consists of a carrier board housing an XMC module with its
fast PCIe Gen2 x4 interface delivered to the backplane connector. AMC modules offer many advantages directly suitable for real-time data acquisition. They support hot swap capability and intelligent platform management interface (IMPI) features for high availability, built-in testing and performance monitoring for mission critical applications. AMCs offer a rich collection of gigabit serial interfaces including GbE to system controllers, SATA, SAS or FC links to storage peripherals, and XAUI, PCIe or SRIO links for high-speed data transfer through the system connector. As AMC modules rapidly gained popularity, developers sought a way to retarget them as independent plug-in modules for a simple backplane system architecture instead of as daughter cards for ATCA carrier boards. Thus, the MicroTCA architecture evolved. Many of the concepts from ATCA were pulled forward into this new MicroTCA mechanical configuration to take advantage of the wealth of system management, protocol and industry infrastructure already in place for ATCA. The MicroTCA Carrier is an essential aspect of this new architecture. Shown in Figures 2 and 3, it incorporates all elements of a complete system and typically accepts twelve AMC modules. These include a card cage or “shelf� to house the plug-in AMC modules and a backplane that engages with all of the power and signal pins on the AMC connectors. Also connected to the backplane are one or two MicroTCA Carrier Hubs (MCHs), one to four Power Modules (PMs) and one or two Cooling Units (CUs). The MCH includes fabric and Ethernet switches for at least one GbE port plus four lanes of XAUI, PCIe or SRIO to each of the twelve AMC modules. In this way, all of the AMC modules can communicate over Ethernet and send fabric data to each other. The MCH also provides a fabric channel uplink and Ethernet ports to other systems. The MicroTCA Hub Management Controller (MCMC) performs management services for twelve AMCs,
four Power Modules and two Cooling Units. Other MCMC functions include shelf management, clock distribution and alarms.
MicroTCA System Example
Because of the fast gigabit serial fabric links, MicroTCA offers an attractive platform for high-speed data acquisition systems. As an example, the Model 56660 AMC data acquisition module in Figure 1 is quite suitable as the front end of a real-time recording system. The four A/D converters digitize front panel analog inputs, each producing 200 MSamples/s. With two bytes per sample, this means 400 Mbytes/s per channel, or 1600 Mbytes/s for the 4-channel AMC module. This rather demanding traffic load can be accommodated through the PCIe Gen 2 x4 backplane fabric interface, which supports a peak transfer rate of 2000 Mbytes/s.

Figure 4 shows a simplified block diagram of a complete MicroTCA recording system, showing connectivity of the important PCIe fabric links. A single board computer (SBC) AMC module hosts the operating system and provides system memory accessible through its PCIe link to the MCH switch. A RAID Controller AMC module, also connected via PCIe through the MCH fabric switch, offers eight SATA-III ports to eight solid state drives, each capable of read/write speeds of over 300 Mbytes/s.

Figure 4: Complete 1600 Mbyte/s MicroTCA data acquisition system. (Block diagram: the Data Acquisition AMC module with its A/Ds and DMA engine, the Single Board Computer AMC module with CPU, memory and operating system, and the RAID Controller AMC module with its DMA engine and SATA links to the solid state drives, each connected by PCIe x4 through the MCH fabric switch on the MicroTCA backplane.)

Pentek’s SystemFlow software runs on the SBC to orchestrate real-time data transfers among these three AMC modules by managing hardware direct memory access (DMA) controllers using the PCIe fabric links. Parameters are sent to these linked-list DMA engines to specify the size and destination of data blocks to be moved to or from system memory on the SBC. Once a DMA block transfer is completed, the next DMA operation starts automatically, and the CPU receives a notification interrupt so it can monitor progress.

Specifically, the DMA controller on the AMC data acquisition module moves blocks of A/D data into circular buffers in system memory of the SBC. Then, the DMA controller on the RAID controller AMC moves data from completed system memory blocks to the RAID controller. Finally, the RAID controller “stripes” data by writing simultaneously across eight SSDs to achieve aggregate storage speeds to the RAID array of over 2000 Mbytes/s. This scheme ensures that the CPU does not touch any data, so that Windows or Linux host operating systems impose no adverse effect on maintaining sustained real-time recordings. Also, the data on the RAID array is stored in NTFS format so it is immediately available for analysis, display or processing applications running on the host CPU.
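The rate budget worked through above can be re-checked with a few lines of C. This is only a sketch of the arithmetic: the function names are hypothetical, the 5 GT/s line rate and 8b/10b encoding are the standard PCIe Gen 2 figures rather than numbers quoted in this article, and 300 Mbytes/s per drive is the lower bound the text gives for each SSD.

```c
#include <assert.h>
#include <stdio.h>

/* Sustained A/D output in Mbytes/s: channels x MSamples/s x bytes per sample. */
static long adc_rate_mbytes(int channels, int msamples_per_s, int bytes_per_sample)
{
    assert(channels > 0 && msamples_per_s > 0 && bytes_per_sample > 0);
    return (long)channels * msamples_per_s * bytes_per_sample;
}

/* Peak PCIe Gen 2 rate in Mbytes/s. Each lane signals at 5000 Mbit/s and
 * 8b/10b encoding carries 8 data bits per 10 line bits. Packet and protocol
 * overhead is ignored here, so real payload throughput is lower. */
static long pcie_gen2_peak_mbytes(int lanes)
{
    assert(lanes > 0);
    return (long)lanes * 5000 * 8 / 10 / 8;
}

/* Aggregate striped write rate in Mbytes/s across identical drives. */
static long raid_stripe_rate_mbytes(int drives, int mbytes_per_drive)
{
    assert(drives > 0 && mbytes_per_drive > 0);
    return (long)drives * mbytes_per_drive;
}

/* Print the budget for the system described in the text.
 * Returns 1 if every downstream stage can keep up with the A/D source. */
static int report_budget(void)
{
    long source = adc_rate_mbytes(4, 200, 2);
    long fabric = pcie_gen2_peak_mbytes(4);
    long raid   = raid_stripe_rate_mbytes(8, 300);

    printf("A/D source            : %ld Mbytes/s\n", source);
    printf("PCIe Gen 2 x4 peak    : %ld Mbytes/s\n", fabric);
    printf("RAID stripe (8 drives): %ld Mbytes/s\n", raid);
    return source <= fabric && source <= raid;
}
```

Calling report_budget() from a main function shows the 1600 Mbytes/s source rate sitting under both the fabric peak and the striped write rate, with 400 Mbytes/s of nominal headroom on the PCIe link.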
MicroTCA: Ready for Real-Time
Its well-defined, fast and straightforward architecture makes MicroTCA a serious contender for high-end, real-time embedded systems. Using popular gigabit serial fabrics like XAUI, PCIe and SRIO, it delivers substantial backplane bandwidth, more than adequate for many applications. It also provides a rich system infrastructure including various form factors, power and cooling strategies, system management facilities, high availability and redundancy features, all in a highly modular design.

Because of the wealth of MicroTCA products available for the cost-sensitive telecom community, systems are often 20 to 40% less expensive than comparable systems using more traditional embedded card cage architectures. Extensions to the base specification now support ruggedized MicroTCA systems with both air- and conduction-cooled AMC modules, including specifications for shock, altitude and vibration required for many government and military applications. System level software written for PCIe or SRIO systems is highly portable to MicroTCA because of the standard fabric interconnects and the operating systems supported by available SBC AMC modules. All in all, MicroTCA is definitely worth considering for your next data acquisition system.

Pentek, Upper Saddle River, NJ. (201) 818-5900. [www.pentek.com].
technology deployed Wringing Performance out of Multicore
SequenceL: An Elegant and Efficient Approach to Exploiting the Power of Parallelism

Parallelizing complex code efficiently across multiple processor cores gets to be a task beyond human ability. SequenceL is a high-level language that can automatically analyze and output parallel code as C++ and OpenCL to run on a variety of today’s multicore processors.

by Doug Norton, Texas Multicore Technologies; Larry A. Lambe, Multidisciplinary Software Systems Research; and Richard Luczak, RL Aerodynamics
In 2004 CPU providers made a major shift; rather than increasing the clock speed to increase the performance of each new chip generation, they began to add more processor cores to the chip. Clock speeds had risen to about 4 GHz, and at that speed the resulting leakage current meant the chips used too much power and gave off far too much heat to be practical for many uses. This was particularly true for laptop and mobile devices, but “green” was increasingly becoming a factor in datacenter environments, particularly in dense cities such as New York. As one datacenter manager shared, “You don’t worry about power and cooling until you run out.”

That shift put the challenge on software developers to effectively use these cores. Initial efforts focused on simple partitioning for dual core processors. Next were optimized libraries for compute-intensive functions that could run across multiple cores. As core counts increased, so did programming complexity, such that true parallel programming has become necessary, a skill that had been reserved
for the most elite of programmers. The latest processors have a heterogeneous mix of cores, with specialized cores such as (GP) GPUs added to traditional CPU cores. This looks good on paper because GPUs deliver outstanding floating point performance per watt. Unfortunately, this once again makes the programming challenge much harder, since not only do they require parallelization, but also a very different—and relatively low level—programming language such as CUDA or OpenCL. The simple fact is humans don’t write large scale parallel code well, nor do they want to keep rewriting it every time a processor evolution occurs. Embedded system developers work in an environment with both time-to-market pressure and high software quality requirements. Adding complicated parallel programming and testing complexity on top of that is typically untenable. To do this faster, easier and without rewrites requires a dramatically better approach. SequenceL does this by allowing programmers to work at a high level, without
regard to execution or performance, and allowing compiler technology to do the difficult, low-level work. Just as hardware engineers have moved up in abstraction to deal with complexity, from drawing gates to Verilog/VHDL and now to SystemC, it is time for software engineers to move up again. And just like some good EDA tools that are “correct by construction,” the two computational laws that underpin SequenceL are provably race free.
Challenges for Programming
An embarrassingly parallel (EP) task is one for which little or no effort is required to separate the whole task into a number of parallel subtasks. It is easy to both conceptualize and implement algorithms for EP tasks; one simply sends the separate subtasks to different CPUs (cores) and combines the answers appropriately. Unfortunately, most real-world applications are not EP problems, so the challenge is making the majority of applications able to exploit today’s multicore architectures to maximize both performance and energy efficiency. Current approaches such as pthreads, OpenMP and MPI, coupled with tools that analyze code for parallelisms, leave the hard work to the user, and it is complex and tricky work for the majority of programmers. These also pose a major QA challenge to test for race conditions and deadlocks. Clearly these approaches have no hope of scaling to the even higher core counts and heterogeneous processors on the vendors’ roadmaps.

An example of an EP task is any process q(n) that generates an array of length n with the property that for all 1 <= i <= n, the ith element q(i) does not depend on any other element, such as q(i-1). In this case, for any chosen partition of 1, ..., n, e.g.,

1 = p[1] < p[2] < ... < p[k] = n,

one can calculate the array elements in q(n) by independently calculating the elements on each subinterval p[i] ... p[i+1] and assembling the results (in the proper order) into an array of length n.
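A short C sketch may make the partition argument concrete. The element function q_elem and both helpers are hypothetical, chosen only for illustration. The pieces are made half-open so that shared partition points are not computed twice, and they are visited last to first simply to stress that the order of evaluation does not matter.

```c
#include <assert.h>

/* Hypothetical element function: the i-th value depends only on i. */
static double q_elem(int i)
{
    return 0.5 * i * i;
}

/* Store q_elem(lo) .. q_elem(hi) at out[lo-1] .. out[hi-1] (i is 1-based). */
static void fill_range(double *out, int lo, int hi)
{
    for (int i = lo; i <= hi; i++)
        out[i - 1] = q_elem(i);
}

/* Fill out[0..n-1] piece by piece, given k partition points stored as
 * p[0] == 1 < p[1] < ... < p[k-1] == n (0-based storage of p[1..k]). */
static void fill_partitioned(double *out, const int *p, int k)
{
    assert(k >= 1 && p[0] == 1);
    if (k == 1) {                /* n == 1: a single element */
        fill_range(out, 1, 1);
        return;
    }
    for (int j = k - 2; j >= 0; j--) {
        int lo = p[j];
        /* Each piece stops just before the next partition point,
         * except the final piece, which also includes n. */
        int hi = (j == k - 2) ? p[j + 1] : p[j + 1] - 1;
        fill_range(out, lo, hi);
    }
}
```

With partition points {1, 4, 7, 10}, fill_partitioned produces exactly the array that a single call to fill_range(out, 1, 10) produces; on a multicore machine each call to fill_range could be handed to a different core.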
Using Posix Threads (pthreads) on a Multicore Machine
Consider a simple special case of the above EP situation. Let f be a function that takes a non-negative integer as argument and returns a floating point number. So given n, we would like to generate the array [f(1),...,f(n)]. More generally, we could consider the function, using ANSI C notation,

void f(double *arr, int start, int length)

that returns the array [f(start), f(start+1), ..., f(start+length-1)] in arr.

For an example, consider a function (in ANSI C notation) of the form

double f(int n);

We want a multicore program of the form

void seq_f(double *ret, int start, int length)

which returns the array [f(start), f(start+1), ..., f(start+length-1)] in ret.
This problem is EP. The first step in programming it is to figure out how many threads to use. We want to use as many threads as there are available cores. For this we can require user input or read the number of available cores num_cor from an environment variable. Pseudo code using a C-like syntax will be used. Let

l1 = length / num_cor;        /* most of the subsections */
l2 = l1 + length % num_cor;   /* last segment with the remaining elements */

Define a structure

typedef struct {
    double *seq;         /* a pointer to the whole array */
    int sub_seq_start;   /* the start index for this subsequence */
    int sub_seq_len;     /* the length of this subsequence */
} ArgVal;

to eventually hold the arguments for the ith subsequence. Now a helper function is needed. That helper function will execute f(arr, start, length) with appropriate input on the ith core. Here it is:

void *doit(void *arg) {
    /* apply f to the appropriate fields in (ArgVal *) arg */
    return NULL;
}
Figure 1: Example screenshot showing the SequenceL Eclipse IDE plug-in.

top - 13:56:04 up 17 days, 23:02, 8 users, load average: 1.96, 0.64, 0.30
Threads: 400 total, 9 running, 391 sleeping, 0 stopped, 0 zombie
%Cpu0 : 89.7 us, 5.0 sy, 0.0 ni, 5.3 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st

top - 14:00:32 up 17 days, 23:06, 8 users, load average: 2.01, 0.94, 0.48
Threads: 366 total, 9 running, 357 sleeping, 0 stopped, 0 zombie
%Cpu0:  94.7 us   4.3 sy   0.0 ni   1.3 id   0.0 wa   0.0 hi   0.0 si   0.0 st
%Cpu1:  94.0 us   4.7 sy   0.0 ni   1.3 id   0.0 wa   0.0 hi   0.0 si   0.0 st
%Cpu2:  94.7 us   4.3 sy   0.0 ni   1.3 id   0.0 wa   0.0 hi   0.0 si   0.0 st
%Cpu3:  93.7 us   5.0 sy   0.0 ni   1.3 id   0.0 wa   0.0 hi   0.0 si   0.0 st
%Cpu4:  95.0 us   5.0 sy   0.0 ni   1.3 id   0.0 wa   0.0 hi   0.0 si   0.0 st
%Cpu5:  94.4 us   4.3 sy   0.0 ni   1.0 id   0.0 wa   0.0 hi   0.0 si   0.0 st
%Cpu6:  95.0 us   4.0 sy   0.0 ni   0.0 id   0.0 wa   0.0 hi   0.0 si   0.0 st
%Cpu7:  94.4 us   4.3 sy   0.0 ni   1.0 id   0.0 wa   0.0 hi   0.0 si   0.0 st
KiB Mem:  8147160 total, 5433952 used, 2713208 free, 420108 buffers
KiB Swap: 2104316 total, 0 used, 2104316 free, 2797604 cached

TABLE 1: Results of an 8-core multicore SequenceL performance test.
Of course, the user will have to allocate and instantiate the structure for the ith core using the correct input for this. We omit the simple arithmetic (involving l1, l2 above) here. An array arg of length num_cor is used for this. The code then proceeds as follows. Allocate an array th of num_cor threads, then create and join the threads:

for (i = 0; i < num_cor; i++)
    pthread_create(&th[i], NULL, doit, &arg[i]);
for (i = 0; i < num_cor; i++)
    pthread_join(th[i], NULL);

This completes the outline of the pseudo code.
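Putting the pieces of the outline together gives something like the following. It is a sketch under stated assumptions, not the authors' code: the body of f is a hypothetical stand-in, num_cor is passed as a parameter rather than read from the environment, seq_f returns a status code instead of void, and ArgVal gains a base field (not in the structure shown above) so that each thread knows which argument of f corresponds to array index 0.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical stand-in for the article's f. */
static double f(int n)
{
    return (double)n * n;
}

typedef struct {
    double *seq;         /* a pointer to the whole array */
    int base;            /* assumption: argument of f for array index 0 */
    int sub_seq_start;   /* the start index for this subsequence */
    int sub_seq_len;     /* the length of this subsequence */
} ArgVal;

/* Thread entry point: fill one subsequence of the shared array. */
static void *doit(void *arg)
{
    ArgVal *a = arg;
    for (int i = 0; i < a->sub_seq_len; i++) {
        int idx = a->sub_seq_start + i;
        a->seq[idx] = f(a->base + idx);
    }
    return NULL;
}

/* Fill ret[0..length-1] with f(start) .. f(start+length-1) using num_cor
 * threads. Returns 0 on success, -1 on allocation or thread-creation failure. */
static int seq_f(double *ret, int start, int length, int num_cor)
{
    assert(ret != NULL && length >= 0 && num_cor > 0);

    pthread_t *th = malloc((size_t)num_cor * sizeof *th);
    ArgVal *arg = malloc((size_t)num_cor * sizeof *arg);
    if (th == NULL || arg == NULL) {
        free(th);
        free(arg);
        return -1;
    }

    int l1 = length / num_cor;        /* most of the subsections */
    int l2 = l1 + length % num_cor;   /* last segment with the remaining elements */
    int created = 0;
    int rc = 0;

    for (int i = 0; i < num_cor; i++) {
        arg[i].seq = ret;
        arg[i].base = start;
        arg[i].sub_seq_start = i * l1;
        arg[i].sub_seq_len = (i == num_cor - 1) ? l2 : l1;
        if (pthread_create(&th[i], NULL, doit, &arg[i]) != 0) {
            rc = -1;
            break;
        }
        created++;
    }
    /* Join only the threads that were actually started. */
    for (int i = 0; i < created; i++)
        pthread_join(th[i], NULL);

    free(th);
    free(arg);
    return rc;
}
```

A call such as seq_f(ret, 1, 10, 4) starts four threads that handle 2, 2, 2 and 4 elements respectively. Because the threads write to disjoint index ranges, no locking is needed. With GCC or Clang the code is typically built with the -pthread option.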
The SequenceL Language and Compiler

SequenceL may be new to most readers, but it has been in development for more than 20 years. Dr. Daniel Cooke, Dr. Nelson Rushton and Dr. Brad Nemanich worked with NASA over a more than 20 year period while at Texas Tech University, with NASA and other agencies providing over $10 million in research grants. NASA originally wanted to develop a specification language that was easy to read but could also be executed, to eliminate both the ambiguity and the need for software prototyping and its associated costs. They discovered that SequenceL was so readable that programs required no additional documentation, and a plug-in can readily be added to the Eclipse IDE (Figure 1). It was during this development process that the self-parallelizing nature of SequenceL was discovered and enhanced. Texas Multicore was formed in 2009 to commercialize SequenceL as an auto-parallelizing software development environment for the multicore computing industry.

SequenceL is a Turing complete, domain-independent, functional programming language. It allows engineers to express problems in engineering terms, thus working at a high level, without regard to execution or performance, and allowing compiler technology to do the difficult, low-level work. A key focus from the beginning was to maintain a compact language (~15 grammar rules compared to 150+ for Java). To that end, the inventors chose not to re-invent I/O, instead relying on C++, which is familiar to most and widely supported across platforms. The output of the SequenceL compiler is robust, massively parallelized C++ and OpenCL, allowing it to easily drop into new and existing frameworks (Figure 2).

Figure 2: SequenceL products and development flow, showing support for the industry-standard Eclipse and Visual Studio IDEs for input and parallelized C++ and OpenCL output. (Flow: SequenceL source code is developed and debugged in the IDE with the integrated interpreter and debugger, then passed to the SequenceL compiler, which emits parallelized C++ and OpenCL source code; a C++ compiler/linker combines this with the SequenceL runtime libraries and other application code or libraries to produce object code.)

In SequenceL, the program is quite concise:

seq_f(n) := f(1...n);

The more general program would simply be

seq_f(start,length) := f(start...start+length-1);

The three-dot operator ... in the above SequenceL code generates a list of integers ranging inclusively between its two operands. For example, “2...6” creates the list of integers [2,3,4,5,6]. It is easy to see that f(1...n) denotes the list of a given function f evaluated at every point of the list of integers ranging inclusively between 1 and n. For example, “f(1...4)” creates the list [f(1),f(2),f(3),f(4)].
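For readers who think in C, the meaning of these two constructs can be sketched serially as follows. The helper names are hypothetical, and the sketch says nothing about how the SequenceL compiler actually generates or schedules its parallel output; it only mirrors the inclusive range and the element-by-element application described above.

```c
#include <assert.h>

/* Write lo, lo+1, ..., hi into out and return the element count.
 * Mirrors the inclusive range "lo...hi"; this sketch assumes lo <= hi
 * and that out has room for hi - lo + 1 integers. */
static int range_inclusive(int lo, int hi, int *out)
{
    assert(lo <= hi);
    int count = 0;
    for (int v = lo; v <= hi; v++)
        out[count++] = v;
    return count;
}

/* Apply fn to every element of xs, mirroring f applied to a list.
 * No iteration depends on another, which is what makes the work EP. */
static void apply_each(double (*fn)(int), const int *xs, int count, double *out)
{
    for (int i = 0; i < count; i++)
        out[i] = fn(xs[i]);
}

/* Hypothetical example function to pass as fn. */
static double square(int n)
{
    return (double)n * n;
}
```

Here range_inclusive(2, 6, xs) fills xs with 2, 3, 4, 5, 6, matching the “2...6” example, and applying square to the range 1 to 4 yields the four values square(1) through square(4) in order.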
Performance
The performance of the compiled and executed SequenceL code for the above task was tracked using the “top” program on a Linux machine with an eight core Intel multicore chip. The results were as shown in Table 1. The first column in Table 1 shows the CPU number (numbering starts at 0). The second column, with a number followed by the postfix “us,” shows the percentage of time spent on the user process; for example, “94.7 us” indicates that 94.7% of total time on a given CPU is spent on the user process. The higher the number and the closer to 100, the better. The third column shows the percentage of time spent on system processes; for example, “4.3 sy” indicates that only 4.3% of total time on a given CPU is spent on system processes. The remaining columns show the percentage of time spent on other processes such as I/O completion or hardware and software interrupts; in most cases they are zero or close to zero. The SequenceL results differ insignificantly from those for the hand-coded pthreads code.
SequenceL and Problems Which Are Not EP
As noted earlier, SequenceL exposes all the parallelisms in code, many of which humans would not see or might not think worth the trouble. We recently were given the task to demonstrate the power of SequenceL on just a two core processor. We searched for an algorithm that was deemed not parallelizable and chose the Barnes-Hut N-body simulation, used to model galaxies in which every body exerts a physical force on every other body. Unlike the embarrassingly parallel, brute-force, n² approach, the conclusion in the academic white papers was that Barnes-Hut was not parallelizable since each step relies on the previous state. Yet when written and compiled in SequenceL, then run on the two core chip, we achieved 2X the performance once the simulation reached 2000 bodies! When analyzing how this could be,
we saw that SequenceL automatically found all the fine grained parallelisms at each step and they clearly added up. We were only mildly surprised since we have seen many similar results where SequenceL code runs faster than was expected. At the low end, on a one core system, we have seen SequenceL code run faster than the C++ reference code. For a large industrial process control application, the SequenceL implementation achieved a 36X speedup on a 16 core server. Writing parallel code for problems that are not EP is much more difficult and fraught with errors, but not in SequenceL. In each of these non-EP cases, the SequenceL code was written in a fraction of the time of the single-threaded code and worked correctly. In a recent WirelessHART (IEC 62591) customer project, we worked with their team to implement a new mesh networking algorithm in SequenceL to scale the performance to the thousands of nodes needed for real-world environments. Their internal development to that point had been in Java and had taken five months. The SequenceL implementation took just two weeks and
ran five times faster on a two core embedded processor. More surprising was that the SequenceL code got different results. They discovered the SequenceL code was working correctly the very first time, and it was used to debug the Java code. But the speed of iteration in SequenceL became even more powerful. We then worked with the customer right in the conference room, doing some very fast iterations in SequenceL to further improve the algorithms. This same effort in a traditional language would have taken many more weeks and still not achieved the performance of SequenceL on a multicore system.

Texas Multicore Technologies, Austin, TX. (512) 381-1100. [www.texasmulticore.com].
RL Aerodynamics, Fort Worth, TX. (817) 246-2585. [www.rlaerodynamics.com].
Multidisciplinary Software Systems Research, Bloomingdale, IL. [www.mssrc.com].
Solid or Spin... we go both ways
Ruggedized VPX Drive Storage Module

Whatever your drive mount criteria, everyone knows the reputation, value and endurance of Phoenix products. The new VP1-250X, compatible with both solid state and rotating drives, has direct point-to-point connectivity or uses the PCI Express interface with the on-board SATA controller. It is available in conduction cooled, conduction with REDI covers (VITA 48) and air cooled configurations. Leading the way in rugged COTS data storage technology for decades, Phoenix keeps you on the leading edge with very innovative products!
We Put the State of Art to Work
www.phenxint.com • 714-283-4800
PHOENIX INTERNATIONAL IS AS 9100 REV C / ISO 9001: 2008 CERTIFIED
PRODUCTS & TECHNOLOGY
FEATURED PRODUCT
Browser-Based Unit Brings Scalable HMIs to PCs, Smartphones, Tablets and Other Mobile Devices

A new system provides a way to build, deploy and view simple, effective and scalable operator interfaces to monitor and control systems and equipment using computers and mobile devices. Using only a modern web browser, groov from Opto 22 securely lets industrial automation end-users, system integrators, machine OEMs, building managers, technicians, or any authorized person quickly build and deploy browser-based interfaces for automation, monitoring and control applications. These operator interfaces can then be viewed on almost any computer or mobile device regardless of its manufacturer or operating system, including PCs, tablets, smartphones, and even smart high-definition televisions.

groov reduces the time, complexity and cost usually associated with mobile HMI development by operating completely within a modern web browser and running on a secure and industrially hardened network appliance. groov offers a simple yet flexible environment for developing operator interfaces with zero programming, and requires no per-seat runtime or viewing licenses. Overcoming the biggest challenge in developing for multiple screen sizes, groov automatically and gracefully scales all screens, page objects and gadgets, allowing groov HMIs to be viewed and manipulated from virtually any device of any screen size. groov works with modern web browsers like Internet Explorer, Firefox, Chrome, Safari, or Opera running on operating systems including iOS, Android, Microsoft Windows, Mac OS and Linux. groov benefits from the capabilities of these browsers by using the latest web standards like HTML5, CSS3 and SVG.

The heart of the groov system is a secure industrial appliance called the groov Box, which runs groov software. All network communication between a web browser and the groov Box uses an encrypted secure sockets layer (SSL) over an HTTPS connection. The groov Box does not respond to any other communication methods on any other ports. groov connects to Opto 22 SNAP PAC automation systems and OptoEMU energy monitoring products over a separate and segmented wired or wireless Ethernet network, adding a secure barrier for control systems. Support for the OPC-UA protocol is planned in 2013 and will allow groov to communicate with systems from other manufacturers that offer an OPC-UA server.

The simple and flexible development environment, groov Build, dramatically reduces the time needed to build interfaces when compared to traditional HMI screen building tools. groov Build includes a library of scalable, touchscreen-ready gadgets: gauges, buttons, range indicators, text entry, sliders and trends. Images and real-time video from network IP cameras, also fully scalable, can be added as well. Designed to support HMI best practices, groov Build includes the tools necessary to build high-performance, intelligible information and control screens. Opto 22’s groov has a suggested list price of $1,995.

Opto 22, Temecula, CA. (951) 695-3000. [www.opto22.com].
ZigBee Transceivers Add Wi-Fi Coexistence Schemes, Deep Packet Inspection and Wake-on-LAN Features

A new generation of ZigBee transceivers contains an advanced coexistence scheme that allows Wi-Fi, Bluetooth and ZigBee chips to work side by side in the same device. The GP501 from GreenPeak Technologies also contains deep packet inspection, allowing deep sleep modes of set-top boxes and other host devices by means of Wake-on-LAN messages.

ZigBee shares the 2.4 GHz frequency band with Wi-Fi equipment. The GP501 has a coexistence interface to allow optimized and co-located ZigBee/Wi-Fi radios to work in the same device, successfully avoiding RF interference when operating simultaneously. This coexistence interface enables arbitration over the shared radio frequency medium to prevent contention, signal degradation and data loss. Another advantage of the GP501 is its small size: its 32-pin, 5 x 5 mm footprint allows integration into even the smallest product form factors.

A new key feature of the GP501 ZigBee transceiver chip is deep packet inspection (DPI) for ZigBee applications. DPI enables advanced packet management, allowing the host processor to go into a deep-sleep mode to conserve power. While most other ZigBee transceiver chips only include a superficial inspection of the MAC and PHY headers, the GP501 looks beyond these layers to execute DPI, and based on the outcome, the chip can decide if the packet has to be passed on to the higher layer application or can be ignored. The DPI engine is also security aware, blocking unauthorized packets without involving the host processor and ensuring the system does not waste energy analyzing non-compliant packets. The DPI feature can be used for Wake-on-LAN functionality, where ultra-low-power ZigBee is used to wake up the main processor from its sleep mode to enable Wi-Fi networking.

GreenPeak Technologies, San Ramon, CA. (925) 230-6844. [www.greenpeak.com].
Type 10 Nano COM Express Module for Low Power in Tight Spaces

A solution is being offered for users with a Type 10 carrier board who want to improve performance, find a smaller footprint for applications in tighter spaces and operate smoothly in temperatures that can range between -40° and 80°C. At 84 mm x 55 mm and driven by Intel’s Dual-Core Atom N2800/2600 processor, the new PCOM-B21A from American Portwell fills the restricted space needs, and also provides improved graphics support at less than 12W for a fanless solution.

PCOM-B21A also includes DDR3 memory up to 4 Gbyte (N2800) and 2 Gbyte (N2600); four PCIe x1 lanes that can be configured to one PCIe x4; eight USB 2.0 ports; dual independent displays via VGA, LVDS and DisplayPort (DP); a tested operating temperature range of -40° to 80°C (N2600); and an optional onboard SSD to provide fast operating speed and maintain the small footprint. The new PCOM-B21A Type 10 COM Express module is targeted for applications in fields such as automation, medical, military, networking, outdoor digital signage and transportation. One of the key benefits is that it supports all Type 10 carrier boards and extends their operating functionality into a much wider temperature range.

American Portwell, Fremont, CA. (510) 403-3399. [www.portwell.com].
Multiprocessor with GPGPU Delivers Performance, Functionality in Less Space

A new 6U OpenVPX multiprocessor board brings three key benefits to defense prime contractors and systems integrators. First, the IPN251 from General Electric Intelligent Platforms responds to the growing demand to provide more functionality within a constrained size, weight and power (SWaP) envelope. The massively parallel nature of general purpose computing on a graphics processing unit (GPGPU) technology allows larger numbers of more sophisticated algorithms to be processed in a single chassis slot than is possible with conventional computing.

Second, by allowing more processing to take place on a vehicle, it means that the platform is less dependent on the external network, and can become more self-sufficient, which also frees valuable network bandwidth for other applications. It also allows the vehicle’s sensor capability to be expanded enormously, as processing of sensor-derived data can take place on board.

Third, it reduces time-to-market and time-to-revenue for systems integrators and prime contractors, while minimizing risk and cost. The IPN251 does this by harnessing proven “best practice” from commercial high performance computing. By using open, industry hardware and software standards that are well understood and that benefit from a substantial support infrastructure, the IPN251 allows development time to be significantly reduced.

The IPN251 combines the latest 384-core NVIDIA Kepler GPGPU technology with a third generation Intel Core i7 quad core processor to deliver outstanding computing performance in a wide range of demanding data-intensive applications, particularly intelligence, surveillance, reconnaissance (ISR). The IPN251 offers substantially improved performance not only because of the more powerful NVIDIA technology it incorporates, but also through its support of GPUDirect, which delivers superior data transfer times with the lowest latency and maximum throughput. Onboard and backplane communication is maximized by the use of PCI Express Gen 3 components, including the dual-channel Mellanox ConnectX-3 10 Gigabit Ethernet/InfiniBand network interface card. The third generation Intel Core i7 CPU and the GPU are connected via a 16-lane PCI Express Gen 3 switch.

General Electric Intelligent Platforms, Huntsville, AL. (780) 401-7700. [http://defense.ge-ip.com].
Major Updates to Leading Tools for Low-Power Renesas MCUs

A new version of the development tool suite for Renesas RL78 MCU adds a large amount of new functionality for code writing and debugging. Among the improvements to the IAR Embedded Workbench from IAR Systems is the new text editor and source browser, which includes user-friendly features such as autocompletion, parameter hints, code folding, block select and indent, bracket matching, zooming and word/paragraph navigation. Also added is project connections functionality for automated integration with device configuration tools. This makes it possible to import files or file packages generated by such external code generation tools and enables IAR Embedded Workbench to automatically detect changes in the generated file set.

New in the comprehensive C-SPY Debugger is the possibility to connect an E1 or E20 emulator to a running system to inspect it without interrupting program execution. Also introduced are Sampled Graphs and several new windows. The Sampled Graphs allow you to specify variables for which you want to collect data samples. You can view the sampled data either in table format in the Data Sample window or as graphs in the Sampled Graphs window. A new Custom SFR window lets you define custom special function registers (SFRs) in C-SPY with selectable access size and type, while the Call Graph window displays all calls made to and from each function from any source file in the active project. The new Macro Quicklaunch window makes it possible to evaluate expressions and to launch C-SPY macros.

Thanks to the longstanding close relationship between IAR Systems and Renesas Electronics, IAR Embedded Workbench was the first set of tools available for developing the low-power RL78-based microcontrollers. Since the first release in March of 2011, IAR Systems has made several updates and improvements to the tools, including extraordinary optimizations for code size and speed. The new version brings even further improved optimizations that make code generated for floating-point operations even faster.

IAR Systems, Foster City, CA. (650) 287-4250. [www.iar.com].
Desktop Expansion Enclosure Connects with Either Thunderbolt or PCIe

An expansion enclosure offers Thunderbolt or PCIe expansion by supporting a single PCIe x8 Gen3 short card, allowing the addition of greater functionality to a laptop or workstation. The lightweight nanoCUBE appliance from One Stop Systems is an attractive companion to a PC or workstation for adding a special I/O card that there isn’t room for in the system itself. It can easily be disconnected later. For example, adding a video editing card to the nanoCUBE creates a portable video editing appliance. Lightweight and whisper-quiet, it’s ideal to accompany your laptop to field locations that may be noise-sensitive environments. The nanoCUBE with Thunderbolt expansion lists for $450 and the nanoCUBE with PCIe x8 expansion lists for $625. Both are available online at maxexpansion.com.

One Stop Systems, Escondido, CA. (877) 438-2724. [onestopsystems.com].

40G AdvancedTCA Backplane Boosts Density and Bandwidth

New 40G ATCA backplanes available in 6-slot and 14-slot sizes from Pixus Technologies are designed to PICMG 3.0 Rev 3.0 specifications. The 6-slot version features dual pluggable shelf manager/switch connections. By combining the shelf managers and switches, users get a full six payload slots versus four payload slots when switches occupy two board slots. The result is that the user gets 50% more computing density in addition to four times the data rate of previous 10G AdvancedTCA shelves. Pixus Technologies’ 14-slot 40G backplane comes with pluggable power entry module (PEM) options. Slot 0 of the backplane allows dual shelf managers to be plugged into the chassis without taking up any of the 14 slots. The 40G ATCA backplanes feature 18-22 layers in FR-4, Nelco4000-13SI, or other laminates. Pixus also offers 10G ATCA backplanes for both vertical-mount and horizontal-mount shelves. Pricing starts under $2,000 depending on quantity and configuration.

Pixus Technologies, Waterloo, Ontario. (916) 524-8242. [www.pixustechnologies.com].
Encryption Engine Secures ASIC-Powered Devices
A compact, fast cryptographic engine delivers high-performance security and extends secure connectivity for resource-constrained, small-footprint processors to low-cost, high-volume ASIC-powered M2M devices. Version 2.5 of SharkSSL from Real Time Logic targets devices commonly used in large-scale networks for municipal utility monitoring, medical record transmission, secure building access and monitoring, and smart grid frameworks where secure message passing is essential. Although transport layer security (TLS) has become the de facto standard for securing information and communications at the desktop and enterprise level, in a February 11, 2013 press release, ABI Research noted the “porous security” of M2M applications threatened to “throttle the successful adoption of M2M in healthcare, industrial installations and consumer homes.” SharkSSL addresses this gap, bringing full end-to-end security to device communications with proprietary software that enables developers to optimize their security implementation for size or speed.
In integrating TLS 1.2 into SharkSSL, Real Time Logic brings the stronger cryptographic algorithms, improved encryption and superior message authentication proven to secure TCP/IP communications at the enterprise level into the small, resource-constrained footprint of an embedded device. SharkSSL implements the new Secure Hash Algorithm-256 (SHA-256), which replaces outdated hash functions and can be used to verify the integrity of copies of the original data without compromising the source. The latest version of SharkSSL also supports the internationally adopted Advanced Encryption Standard (AES) and Galois/Counter Mode (GCM), technologies that combine message encryption and authentication into a single function that can sustain high throughput rates by taking advantage of the parallel processing of the architecture.
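As a rough illustration of the integrity check that SHA-256 makes possible, the sketch below hashes a message and later verifies a copy against the stored digest. It uses Python's standard hashlib module rather than the SharkSSL API, which is not described here, and the payload and function names are hypothetical.

```python
import hashlib
import hmac


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of data as a hexadecimal string."""
    return hashlib.sha256(data).hexdigest()


def copy_is_intact(copy: bytes, expected_digest: str) -> bool:
    """Compare the digest of a copy with the digest of the original."""
    # compare_digest avoids leaking information through timing differences.
    return hmac.compare_digest(sha256_hex(copy), expected_digest)


original = b"meter reading: 42.7 kWh"  # hypothetical M2M payload
digest = sha256_hex(original)

print(copy_is_intact(b"meter reading: 42.7 kWh", digest))
print(copy_is_intact(b"meter reading: 99.9 kWh", digest))
```

Only the digest needs to be stored or transmitted for the check, so the original data never has to be exposed to the party verifying a copy.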
Optimized to take advantage of encryption acceleration, SharkSSL achieves high throughput on ColdFire, Kinetis K60 and all the Cortex-M3 and -M4 processors. Available as source code, SharkSSL can be implemented on any off-the-shelf processor. The SharkSSL library has been successfully deployed on ARM, Freescale and PowerPC-based FPGA architectures. Other processors and accelerators can be accommodated upon request. Out-of-the-box operating system (OS) support includes Integrity, MQX, SMX, ThreadX, VxWorks, EBSnet, rtplatform, uCLinux, Linux and Windows. It can also be used in bare-metal (no OS) configurations. Multi-threading is available for added performance when using an OS that supports multi-threading. SharkSSL V2.5 comes with full source code and royalty-free licenses starting at $8,000.
Real Time Logic, Monarch Beach, CA. (949) 388-1314. [www.realtimelogic.com].
Solid-State Drives & Industrial Box PCs Showcase
Featuring the latest in Solid-State Drive & Industrial Box PC technologies
NEW! ADLMES-8200 Modular Enclosure System
Modular design supports variable stack heights (2 - 6 cards); three basic size profiles available to reduce time to market; front I/O plate can be easily customized for feature and function; high and low ingress protection (IP) systems; thermally conductive base, ribbed sidewalls and finned top for superior conductive and convection cooling.
ADL Embedded Solutions Inc. Phone: (858) 490-0597. Fax: (858) 490-0599. E-mail: sales@adl-usa.com. Web: www.adl-usa.com.
Slim 7P SATA module (SDM4 7P/180D Slim)
Built-in hardware ECC, enabling up to 16/24 bit correction per 1K bytes; static wear-leveling scheme together with dynamic block allocation to significantly increase the lifetime of a flash device and optimize the disk performance; flash bad-block management; power failure management; ATA Secure Erase; S.M.A.R.T.
Apacer Memory America Inc. Phone: (408) 518-8699. E-mail: ssdsales@apacerus.com. Web: usa.apacer.com.
Removable MediaPac Solid State Storage
Attach state-of-the-art technology to legacy systems; ruggedized multi-insertion connector; up to 1TB capacity (MLC & SLC); > 200MB/s data transfer rate; commercial, industrial, military temperature; Secure & Destructive Erase; easily interface to SATA, IDE, SCSI, & USB systems.
Audavi Corporation. Phone: (877) 947-0830. CAGE: 3DJV2. E-mail: info@audavi.com. Web: www.audavi.com.
Desktop SilverStor 8-bay PCIe extender RAID
With its PCIe Extension, the SilverStor™ provides three internal PCIe slots for RAID controllers and other PCIe peripherals to be installed directly within the SilverStor™ desktop system, freeing up slots in your host.
JMR Electronics, Inc. Phone: (818) 993-4801. E-mail: ussales@jmr.com. Web: www.jmr.com.
Desktop ATX Workstation
Configurable with up to 12 cores and a 3.8 GHz Intel Xeon i7 processor, the SilverStor™ ATX Workstation is a powerful, fully customizable purpose-built compact workstation designed for demanding high performance applications.
JMR Electronics, Inc. Phone: (818) 993-4801. E-mail: ussales@jmr.com. Web: www.jmr.com.
BC50I: Rugged and modular industrial box computer
MEN Micro’s BC50I uses the low power AMD processor, combining high computing capabilities with integrated graphics functionality. The BC50I can be used as an independent unit or designed with a display. The unit is fanless and maintenance free.
MEN Micro. Phone: (215) 542-9575. Fax: (215) 542-9577. E-mail: sales@menmicro.com. Web: www.menmicro.com.
Extremely Rugged CPU Module for Highest Reliability in Harsh Environments
An extremely small and rugged Computer on Module (COM) board is based on the industry-standard COM Express mini form factor (55 mm x 84 mm). About the size of a credit card, the VL-COMm-26 from VersaLogic is a highly integrated embedded computer that combines the Intel Atom E6x0T low-power processor with an optional TPM security chip and a Type 10 COM interface. It is designed for high reliability in demanding environments that include extreme temperature, impact and vibration. The VL-COMm-26 is manufactured to IPC-A-610 Class 2 standards. As a first in the industry, Class 3 assembly is also available where extreme reliability is required.
Designed and tested for industrial temperature (-40° to +85°C) operation, the rugged VL-COMm-26 meets MIL-STD-202G specifications to withstand high impact and vibration. Soldered-on RAM (up to 2 Gbytes) and fanless thermal solutions provide additional protection within harsh environments. Thermal monitoring technologies reduce power consumption to remain within specified operating limits, protecting the processor and increasing field reliability. The wide input voltage range (8 to 17 volts) of the VL-COMm-26 simplifies system power supply requirements and is fully compatible with nominal 12V automotive power systems.
The VL-COMm-26 features an Intel Atom E6x0T processor that strikes a balance between performance and power consumption. The E6x0T provides compatibility with a broad range of x86 application development tools for reduced cost and accelerated development time. Advanced Intel technologies, including Intel Hyper-Threading, Intel Virtualization and Enhanced Intel SpeedStep, maximize processor performance. Integrated high-performance graphics provide hardware-accelerated MPEG-4/H.264 and MPEG-2 video encoding and decoding. A standard LVDS video output supports flat panel displays, and an SDVO output supports a variety of video signaling interfaces, including VGA and DVI. For enhanced security, the VL-COMm-26 supports Execute Disable Bit functionality to reduce exposure to viruses and malicious code attacks. An optional onboard Trusted Platform Module (TPM) chip is available for applications that require enhanced hardware-level security functions.
The standard Type 10 pin-out provides industry-standard system interfaces, including Gigabit Ethernet, seven USB ports, three x1 PCIe lanes, two serial interfaces, Intel High-Definition Audio (HDA), LPC and SMBus to the carrier board. An auxiliary board-to-board connector provides two additional serial interfaces and a CAN interface. Dual SATA 3 Gbit/s interfaces support high-capacity rotating or solid-state drives. A microSD socket and SDIO interface provide flexible solid-state drive (SSD) options. Customization options include TPM chip, conformal coating, IPC Class 3 assembly, BIOS modifications, application-specific testing, BOM revision locks and special labeling. Pricing starts at $526 for 1 Gbyte RAM models in OEM quantities.
VersaLogic, Tualatin, OR. (503) 747-2261. [www.VersaLogic.com].
Microsemi TRRUST-Stor™ SSDs
The Most Secure SSDs on the Market. Trusted IC provider; up to 512GB ruggedized storage; fast erase, AES-256 encryption; high reliability, advanced performance; unparalleled protection for sensitive data. Visit us at: www.microsemi.com/pmgp
Microsemi Corporation. Phone: (602) 437-1520. Fax: (602) 437-1731. Web: www.microsemi.com.
USB Wi-Fi Modules 802.11b/g/n Compliant
USB 2.0 hot swappable interface; compatible with USB1.1 and USB2.0 host controllers; up to 300Mbps receive and 150Mbps transmit rate using 40MHz bandwidth; up to 150Mbps receive and 75Mbps transmit rate using 20MHz bandwidth; 1 x 2 MIMO technology for exceptional reception and throughput; 2 U.FL TX/RX antenna ports; Wi-Fi security using WEP, WPA and WPA2; compact size: 1.0” x 1.0” x 0.25” (modules); Windows 2K, XP, Vista, Win7 support; Linux 2.4/2.6 support; RoHS compliant.
Radicom Research, Inc. Phone: (408) 383-9006. Fax: (408) 383-9007. E-mail: sales@radi.com. Web: www.radi.com.
HiDANplus – PCI Express, PCI & ISA Modular Box Systems
-40 to +85°C operating temperatures; rugged, watertight, stackable system; compatible with RTD’s complete product line of stackable PCI Express, PCI, and ISA modules; modular design enables fast, cost-effective field serviceability and capability for upgrades; heat fins and advanced thermal transport technology options; EMI suppression & RF isolation.
RTD Embedded Technologies, Inc. AS9100 & ISO 9001 Certified. Phone: (814) 234-8087. E-mail: sales@rtd.com. Web: www.rtd.com.
INDUSTRY WATCH
Power Consumption vs. Performance
Improving Performance through Power Management and Workload Consolidation in Telecommunications
A constant battle being fought by embedded developers is the balance between designing for low power consumption and designing for high performance. Power management and workload consolidation are two areas of technology focus that have emerged to help vendors address these issues.
by Li Jun, Adlink Technology
There were no surprises in a recent report released by the International Energy Agency (IEA) announcing that energy consumption is steadily rising and will continue to do so in the long term. The report estimates that global energy consumption will rise by 2.5% annually through 2015, with fossil energy continuing to play a dominant role. And while much of this increased consumption can be attributed to lifestyle advancements in developing countries, first-world industry continues to draw down the world’s dwindling energy supply. According to the annual reports of leading telecommunications operators, the industry is a heavy contributor to the trend in rising energy consumption, with some operators listed as the largest energy-consuming companies in their respective countries. These companies continue to introduce complex information and communications technologies, leading to an ever-increasing volume of peripheral equipment running online and an increased demand on the world’s energy supply. As a result of this increased demand, both CO2 emissions and energy
costs have risen in parallel, putting operators under long-term financial pressure to reduce their consumption in order to meet corporate social responsibility requirements and/or federal regulations, as well as to improve their bottom lines. The continued increase in data demand and the push for faster transfer rates lead to an increasing need for speed in communications equipment, which in turn amplifies the overall power consumption of the telecommunications industry. Understanding and investing in power management has never been more important, which has led telecommunications operators and equipment vendors to address the need for emission reduction and to focus on developing energy-efficiency plans based on sustainable development.
In the lifetime of an AdvancedTCA (ATCA) chassis used in a networking deployment, the majority of CO2 emissions are attributed to unit performance and the cooling required for heat dissipation. The most energy is consumed during the operational phase, which accounts for about 80% of total CO2 emissions over the life of the product. During the operational phase, there are three levels (support facilities, network equipment and power conversion) where power is consumed and can be managed (Figure 1). Energy efficiency begins by understanding the relevant technologies that can be used to manage that power consumption. Sound design is essential to thermal management. By reducing CPU usage, power supply output is also reduced, which further reduces the cooling needs within the equipment room. The end result of this cascading reduction is both decreased CO2 emissions and decreased costs associated with cooling due to diminished energy consumption.
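The cascading reduction can be put into rough numbers using the power shares reported in Figure 1 (facility 53%, network equipment 36%, power conversion 11%). The sketch below is only an estimate: it assumes that cooling and conversion losses scale in proportion to equipment power, which the article does not state, and the function names and the 50 W example are hypothetical.

```python
# Shares of total site power as reported in Figure 1.
SHARES = {"facility": 0.53, "network_equipment": 0.36, "power_conversion": 0.11}


def site_watts_per_equipment_watt(shares=SHARES):
    """Total site watts drawn for each watt consumed by network equipment."""
    return sum(shares.values()) / shares["network_equipment"]


def site_savings(equipment_watts_saved, shares=SHARES):
    """Estimated site-wide saving for a given saving at the equipment.

    ASSUMPTION: facility and conversion power scale in proportion to
    equipment power, which is a simplification.
    """
    return equipment_watts_saved * site_watts_per_equipment_watt(shares)


ratio = site_watts_per_equipment_watt()
print(f"{ratio:.2f} W drawn at the site per W of network equipment")
print(f"Saving 50 W on a blade saves roughly {site_savings(50):.0f} W overall")
```

Under that proportionality assumption, each watt saved at the blade is worth nearly three watts at the site, which is why equipment-level power management is treated as the starting point.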
Power Management Concepts and Technologies
In terms of the equipment itself, there are several concepts that contribute to reduced energy consumption. Perhaps the best known is processor-level dynamic power management, in which a device or system is set into different running modes such as performance, ondemand, power save or emergency. With this technology, dynamic voltage scaling and dynamic frequency scaling are used to obtain efficient power management for the processor: the processor core voltage, clock frequency, or both can be reduced in real time to decrease power consumption while still meeting system performance requirements.
Figure 1 Average power consumers: facility (cooling, light) 53%, network equipment 36%, power conversion 11%. Only about 36% of power consumption is used for network equipment, such as servers, storage and network devices, with most of that power going toward heat production. About 2.4% of total power input is really used for effective output. Today's vendors offer solutions to improve energy efficiency in ATCA-based network equipment, which also results in improved energy efficiency in equipment facilities and the power conversion process.
Figure 2 Example of power-capping function.
Power-capping refers to the ability of a system or component to keep its peak power usage below a defined limit through a policy-based strategy driven by a real service model, including raw data such as CPU usage and the number of concurrent sessions.
ATCA shelf-level power management policies that include live migration of virtual machines for load consolidation also reduce power usage and related costs. Live migration allows the server administrator to move a running virtual machine (VM) or application between different physical machines (PMs) without disconnecting the client or application. One of the primary use cases for live migration is resource management in cloud computing, where telecom providers have thousands of VMs running in their data centers. To save energy and cost, and for load balancing, these providers can move VMs using live migration without disrupting customer applications running in the VMs. Policy for live migration can be based on an energy-aware migration model and/or a load-dispatching model, depending on whether the primary goal is energy savings or quality of service. The key to energy savings with live migration is to efficiently pack services onto fewer physical servers, thus providing considerable energy savings by reducing the number of physical servers requiring power and producing heat. While live VM migration brings multiple benefits, such as resource (CPU, memory, etc.) distribution and energy-aware consolidation, the migration of virtual machines itself requires extra power consumption.
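Packing services onto fewer physical servers is essentially a bin-packing problem. The sketch below uses the first-fit decreasing heuristic, a common textbook choice swapped in here for illustration; it is not the policy model described in the article, it ignores the energy cost of the migrations themselves, and the load figures are hypothetical.

```python
def consolidate(vm_loads, server_capacity):
    """Pack VM loads onto servers using first-fit decreasing.

    vm_loads: CPU demand per VM, in the same units as server_capacity.
    Returns a list of servers, each a list of the VM loads placed on it.
    """
    servers = []
    for load in sorted(vm_loads, reverse=True):
        if load > server_capacity:
            raise ValueError(f"VM load {load} exceeds server capacity")
        for placed in servers:
            # Put the VM on the first server that still has room for it.
            if sum(placed) + load <= server_capacity:
                placed.append(load)
                break
        else:
            # No existing server can take this VM, so power on another one.
            servers.append([load])
    return servers


# Hypothetical VM loads, as a percentage of one server's CPU capacity.
loads = [50, 20, 40, 30, 10, 30, 20]
packing = consolidate(loads, server_capacity=100)
print(len(packing), "servers needed:", packing)
```

Servers left empty after packing are the candidates for a sleep state; an energy-aware policy would additionally weigh the expected saving against the migration overhead.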
According to a paper on performance and energy modeling for live migration of virtual machines, published in the proceedings of the 20th International Symposium on High Performance Distributed Computing, tests run to determine power consumption during live migration show that the power overhead of migration is greatly reduced when employing the energy-aware server consolidation model. Model-guided decisions reduced the migration cost by more than 72.9%, at an energy saving of 73.6%.
Taking the telecom industry as an example, today’s ATCA chassis often include a set of high-quality power modules and intelligent fan systems that can be used in coordination to control temperature output and power consumption. Based on tests run by Adlink on a typical ATCA chassis, the power consumption of fans (1/8 of total chassis) can be reduced by 40% with automated policies for variable fan speed based on ambient temperature. For the remaining portion (7/8) of the chassis, embedded software can be used to set the frequencies and operating mode of the CPU, memory and devices on every blade in the chassis in order to achieve dynamic power management and/or power-capping. With the added intelligence in the firmware and control at the software level, power management policies can be put into place to greatly reduce consumption.
Figure 3 Comparison of 24-hour power consumption of a CPU in three separate states (powersave, active power management and performance) (left), and the resulting power consumption broken down over the 24-hour period (right).
From a systems management perspective, dynamic power management might be done on a scheduled basis when the system workloads are known to be well below the system’s full capacity. It might also be used to reduce energy consumption in peak periods. When power (energy) saver mode is enabled, however, the reduction in processor frequency may affect workload performance and throughput.
Power-capping can be done with internal or external processing of monitors and actuators. The actuators might scale processor voltages or scale processor or memory frequencies. The actuators might also “throttle” the processor, which delays instruction processing by injecting dead cycles. When power cap limits are reached and capping techniques enabled, performance of workloads may be impacted.
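As a minimal sketch of how a capping actuator might choose between voltage and frequency settings, the example below picks the fastest operating point whose estimated power stays under the cap. It relies on the textbook CMOS approximation that dynamic power is proportional to C x V^2 x f, which the article does not give; the operating points and the capacitance constant are hypothetical values chosen only to make the example run.

```python
# Hypothetical (frequency in GHz, core voltage in V) operating points.
OPERATING_POINTS = [(1.2, 0.80), (1.6, 0.90), (2.0, 1.00), (2.4, 1.10)]
C_EFF = 20.0  # hypothetical effective switched capacitance, W / (GHz * V^2)


def dynamic_power(freq_ghz, volts, c_eff=C_EFF):
    """Textbook CMOS approximation: P ~ C * V^2 * f."""
    return c_eff * volts ** 2 * freq_ghz


def pick_operating_point(power_cap_w, points=OPERATING_POINTS):
    """Return the fastest point whose estimated power is within the cap.

    Falls back to the slowest point if none fits; a real actuator could
    throttle further at that stage.
    """
    fitting = [p for p in points if dynamic_power(*p) <= power_cap_w]
    return max(fitting) if fitting else min(points)


for cap in (60, 45, 10):
    freq, volts = pick_operating_point(cap)
    print(f"cap {cap} W -> {freq} GHz at {volts} V "
          f"(~{dynamic_power(freq, volts):.1f} W)")
```

Because voltage enters the estimate squared, lowering voltage together with frequency cuts power faster than lowering frequency alone, which is the motivation for scaling both.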
Embedded Power Management
The topology for power management software consists of multiple system-daemon components, each of which manages one blade, and one client component. The client collects power-related data on behalf of the power-managed systems. The system-daemon is an application located on each blade that acts as a power management module. It provides CPU, memory, hard disk, network and virtualization work methods and power-capping functionality to meet performance requirements with the lowest possible power consumption. The actual management client can run on a desktop or laptop. It consolidates and displays the chassis, board and sensor (e.g., temperature) information, as well as actual power consumption (Figure 2).
Setting a policy to switch the CPU of an ATCA blade into power save or active power management mode can reduce consumption by up to 15% on each blade as compared to continuously running in performance mode (Figure 3). About 0.4 kWh could be saved per blade over 24 hours under a service load (Figure 3a). Assuming there are 10 service blades in a 14-slot ATCA system, a total of 4 kWh would be saved every day.
A very powerful method to reduce power consumption is to use only the necessary equipment to handle events. Using the Erlang probability distribution (Figure 4), phases of lower usage can be detected. In the Erlang example, usage is very low during hours 1 to 7. However, the individual blades still consume power, as they are running in power save mode. In this case, each blade consumes 90W in active power management and up to 140W during peak performance. The solution is to use policy-driven live migration to concentrate the workload on the minimum number of CPU blades, and send the ones that are in power save to a sleep state to achieve an additional 25% power savings over active power management mode.
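The chassis-level figures above follow from simple arithmetic, sketched below. The per-blade saving and the 90 W and 140 W draws are taken from the text; the helper names and the 15% low-traffic load are hypothetical, since the hourly values behind Figure 4 are not reproduced here.

```python
import math

ACTIVE_W = 90   # per-blade draw in active power management (from the text)
PEAK_W = 140    # per-blade draw at peak performance (from the text)


def daily_savings_kwh(blades, per_blade_kwh):
    """Chassis-wide daily saving from a per-blade daily saving."""
    return blades * per_blade_kwh


def blades_needed(load_fraction, total_blades):
    """Minimum blades to carry the load if work is packed by live migration.

    load_fraction is the share of total chassis capacity in use (0.0 to 1.0).
    At least one blade always stays awake.
    """
    return max(1, math.ceil(load_fraction * total_blades))


print(daily_savings_kwh(10, 0.4), "kWh saved per day across 10 blades")
print(PEAK_W * 24 / 1000, "kWh per day for one blade held at peak")
print(ACTIVE_W * 24 / 1000, "kWh per day for one blade in active management")
print(blades_needed(0.15, 10), "blades awake at a hypothetical 15% load")
```

Any blade beyond the number returned by blades_needed is a candidate for the sleep state during the low-usage hours.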
Figure 4 Example of Erlang probability distribution being applied to telecommunications network traffic (usage from 0 to 80% plotted over hours 1 to 24).
Boosting Performance through Workload Consolidation
In terms of workload and I/O handling, there has been a market and technology trend toward the convergence of network infrastructure to a common platform or modular components that support multiple network elements and functions, such as application processing, control processing, packet processing and signal processing. Enhancements to processor architecture and the availability of new software development tools are enabling developers to consolidate their application, control and packet processing workloads on a unified blade architecture. Huge performance boosts achieved by this hardware/software combination are making the processor blade architecture increasingly viable as a packet processing solution.
To illustrate the workload consolidation evolution, we developed a series of tests to verify that an ATCA processor blade combined with a data plane development kit (DPDK) supplied by the CPU manufacturer can provide the required performance and consolidate IP forwarding services with application processing using a single platform. In summary, we compared the Layer 3 forwarding performance of an ATCA blade using native Linux IP forwarding, without any additional software optimization, with that obtained using the DPDK. We then analyzed the reasons behind the gains in IP forwarding performance achieved using the DPDK.
The data plane development kit provides a lightweight run-time environment for x86 processors, offering low overhead and run-to-completion mode to maximize packet processing performance. The environment provides a rich selection of optimized and efficient libraries, also known as the Environment Abstraction Layer (EAL), which controls low-level resources and provides an optimized Poll Mode Driver (PMD) as well as full APIs for integration with higher-level applications. The software hierarchy is shown in Figure 5.
Figure 5 EAL and GLIBC in a Linux Application Environment. Diagram labels: customers’ applications and stacks (RIP/OSPF) in Linux user space, reached through the socket API and system calls; virtual Ethernet devices Adlink 0 to Adlink x; ADLINK DPDK Toolkit with flow classification and forwarding between Port 0 and Port x; to the control plane: OS/stack interfaces, virtual device management, route management and NAT feature; Intel IA platform (ADLINK aTCA-6200/6100/6900) carrying frames on Eth 0 to Eth x.
Test Topology
In order to measure the speed at which the ATCA processor blade can process and forward IP packets at Layer 3, we used the test environment shown in Figure 6.
Figure 6 IP Forwarding Test Environment.
By analyzing the results of our tests using the ATCA processor blade’s two 10GbE external interfaces and two 10GbE Fabric Interfaces (40G total capability) with and without the data plane development kit, we can conclude that running Linux with the DPDK and using only two CPU cores for IP forwarding achieves approximately 10 times the IP forwarding performance of native Linux with all CPU threads running on the same hardware platform. Using the DPDK makes it possible to achieve greater than 70% line rate loading in small-packet Layer 3 forwarding. Highly optimized software stacks in the DPDK enable this 10x performance boost. With the control and data planes consolidated on a single IA blade with DPDK enablement, one NPU blade with 40G throughput can be eliminated. A 40G NPU blade normally consumes about 180W, so about 56% of power could be saved by workload consolidation. That figure is consistent with comparing the 180W NPU blade against the 140W peak quoted above for a processor blade, since 180/(180+140) is roughly 56%.
Figure 7 IP Forwarding performance comparisons using 4x 10GbE. IPv4 L3 forwarding performance of Linux and DPDK (ADLINK aTCA-6200, 4 x 10 GE interfaces, 4 threads (2 cores)), in frames per second (fps) and as a percentage of line rate, by packet size:
Packet size (bytes)   Linux (fps)   DPDK (fps)   Linux (% line rate)   DPDK (% line rate)
64                    3868948       42093496     6.5                   70.7
128                   3851348       30954056     11.4                  91.6
256                   3858692       18114924     21.3                  100
512                   3853132       9397968      41.0                  100
1024                  3855340       4789004      80.5                  100
1518                  3214280       3250796      98.9                  100
As is evident in Figure 7, the IPv4 forwarding performance achieved by the processor blade with the DPDK makes it cost- and performance-effective for customers to migrate their packet processing applications from network processor based hardware to x86 based platforms, and use a uniform platform to deploy different services, such as application processing, control processing and packet processing services. Additional details about our test procedure and results can be found in our white paper, Consolidating Packet Forwarding Services on the ADLINK aTCA-6200 Blade with the Intel DPDK.
There are multiple ways to optimize power usage and power efficiency of a multi-board/multi-processor system. We have seen the possibilities using embedded power management, live migration combined with embedded power management, and workload consolidation with throughput optimization. Since system configurations and workload demands vary case by case, there is no generic solution. For each scenario, the techniques and policies to achieve the desired throughput and power consumption must be selected carefully. In the future, power management will remain an important factor for telecommunication operators, as the power density (watts per cubic inch) per system will continue to increase and, with that, intensify the impact on cooling and operational expense.
ADLINK Technology, San Jose, CA. (408) 360-0200. [www.adlinktech.com].
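The line-rate percentages reported in Figure 7 for the 4 x 10GbE forwarding test can be cross-checked from first principles. The sketch below assumes the packet sizes are standard Ethernet frame sizes and adds the usual 20 bytes of per-frame wire overhead (preamble, start-of-frame delimiter and inter-frame gap); the function names are hypothetical, and the measured values are the 64-byte results reported in the figure.

```python
ETHERNET_OVERHEAD_BYTES = 20  # 8-byte preamble/SFD + 12-byte inter-frame gap


def line_rate_fps(frame_bytes, ports=4, link_bps=10_000_000_000):
    """Theoretical maximum frames per second across all ports."""
    bits_on_wire = (frame_bytes + ETHERNET_OVERHEAD_BYTES) * 8
    return ports * link_bps / bits_on_wire


def percent_of_line_rate(measured_fps, frame_bytes, ports=4):
    """Measured throughput as a percentage of the theoretical maximum."""
    return 100.0 * measured_fps / line_rate_fps(frame_bytes, ports)


# Measured 64-byte results as reported in Figure 7.
linux_fps, dpdk_fps = 3_868_948, 42_093_496

print(f"64-byte line rate: {line_rate_fps(64):,.0f} fps")
print(f"Linux: {percent_of_line_rate(linux_fps, 64):.1f}% of line rate")
print(f"DPDK:  {percent_of_line_rate(dpdk_fps, 64):.1f}% of line rate")
print(f"DPDK speed-up over Linux: {dpdk_fps / linux_fps:.1f}x")
```

For 64-byte frames this reproduces the reported 6.5% and 70.7% figures and a speed-up of roughly 10x, in line with the conclusions drawn in the article.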
Advertiser Index
Get Connected with technology and companies providing solutions now
Get Connected is a new resource for further exploration into products, technologies and companies. Whether your goal is to research the latest datasheet from a company, speak directly with an Application Engineer, or jump to a company's technical page, the goal of Get Connected is to put you in touch with the right resource. Whichever level of service you require for whatever type of technology, Get Connected will help you connect with the companies and products you are searching for.
www.rtcmagazine.com/getconnected
Company ... Page ... Website
ACCES I/O Products, Inc. ... 29 ... www.accesio.com
Adlink Technology, Inc. ... 20, 21 ... www.adlinktech.com
Advanced Micro Devices, Inc. ... 52 ... www.amd.com/embedded
American Portwell ... 51 ... www.portwell.com
Commell ... 22 ... www.commell.com.tw
congatec, Inc. ... 19 ... www.congatec.us
Design Automation Conference ... 45 ... www.dac.com
Dolphin Interconnect Solutions ... 9 ... www.dolphinics.com
Extreme Engineering Solutions, Inc. ... 2 ... www.xes-inc.com
IBASE Technology, Inc. ... 13 ... www.ibase-usa.com
Innovative Integration ... 12 ... www.innovative-dsp.com
Intelligent Systems Conference & Pavilion ... 37 ... www.issconference.com
MEN Micro, Inc. ... 23 ... www.menmicro.com
MSC Embedded, Inc. ... 4 ... www.mscembedded.com
One Stop Systems, Inc. ... 41 ... www.onestopsystems.com
Phoenix International ... 37 ... www.phenxint.com
RTD Embedded Technologies, Inc. ... 26, 27 ... www.rtd.com
Schroff ... 4 ... www.schroff.us
Sealevel Systems, Inc. ... 28 ... www.sealevel.com
Sensoray ... 49 ... www.sensoray.com
Solid-State Drives & Industrial Box PCs Showcase ... 42, 43
WDL Systems, LCC ... 31 ... www.wdlsystems.com
RTC (Issn#1092-1524) magazine is published monthly at 905 Calle Amanecer, Ste. 250, San Clemente, CA 92673. Periodical postage paid at San Clemente and at additional mailing offices. POSTMASTER: Send address changes to RTC, 905 Calle Amanecer, Ste. 250, San Clemente, CA 92673.