Intel®
Issue 1, 2008
Custom 3-D Tools the Easy Way: ICE from Softimage Now available in XSI* 7
Visual Effects Wizardry Springs from an 8-Core System: Gareth Edwards Tells How
A Game Engine Finds Itself in Unfamiliar Territory: Unreal Engine* 3
16 Featured Artist: José María Andrés—Genetic Genius
Creative mind José María Andrés unleashes his CG art—and his genius—on the world.
21 It’s Back, Badder than Ever: Bionic Commando*
Gamespot named it one of the Greatest Games of All Time. Now it’s back in richly rendered 3-D splendor with a healthy dose of extreme graphics.
3 Opening an Architecture to Creative Development
Custom tools are a breeze for ICE, a streamlined development environment geared for demanding studio needs.
7 Unlocking the Potential of Graphics Processing: Technology Transfer at Its Finest
Intel’s own Randi Rost explains the secrets of technology transfer at the university level.
11 Epic’s Unreal Engine* Stops Playing Around: Non-Game Uses Open New Opportunities
25 Home-Grown Production Pipeline Rivals Proprietary Platforms: Attila the Hun Comes to Life through Cost-Effective Tools
Visual effects were rendered nightly on a liquid-cooled Intel® Xeon® processor-based workstation in Gareth Edwards’ bedroom.
29 Multi-Threading Goo!— A Programmer’s Diary Tommy Refenes of PillowFort matches wits with threading and emerges victorious.
34 World in Conflict* Rocket back to the Cold War as we discuss World in Conflict with Massive Entertainment.
Why create games when you can build stadiums, train medical technicians, and make children giggle?
From the Managing Editor’s Desk

Multi-processor technology is taking hold, and the benefits of parallelism are now being realized in graphics processing applications. Developments at Intel enable new applications that scale to take advantage of multiple cores, and the work being done in this field—including university research supported by programs underwritten by Intel—suggests that we’re on the cusp of an explosive period of development advances. As summer eases into autumn, events have been heating up quickly on the visual computing front, and we’re thrilled to introduce Intel® Visual Adrenaline, a magazine devoted to the computer graphics and gaming community. Each issue is packed with the latest breakthroughs in visual computing and technology stories from industry luminaries, university viewpoints, milestones, practical applications, and much more.

We hope that you’ll be able to spend some time away from your computer this summer—kayaking on a lake, surviving your latest extreme sports adventure, or hiking through a quiet forest. When you settle back to work again, we encourage you to spend some time with Intel Visual Adrenaline and let these articles stimulate your imagination and your creativity.

Expect more. Our editorial goal is to bring you stories that captivate your senses, kindle new ideas, and keep you on top of the latest developments in visual computing. Enjoy the journey!

Chryste Sullivan
Opening an Architecture to Creative Development: Softimage Goes Custom
Applications devoted to 3-D animation have a tough crowd to please. The creative aspirations of the animators, modelers, technical directors, and developers who tell stories in the digital realm have never been higher. Animated films from Pixar, Disney, and DreamWorks Animation SKG set a remarkably high standard for anyone engaged in this type of media work. While most digital content creators use off-the-shelf animation packages, creating custom scripts when necessary to suit project goals, others rely on their own proprietary, home-grown software, demanding the flexibility to tweak and modify the code to achieve effects that are sometimes just outside their immediate grasp.
With the version 7 release of XSI*, Softimage has delivered a natural crowd-pleaser, opening the door to customization through an innovative development environment. ICE, short for Interactive Creative Environment, provides a highly customizable framework that lets developers quickly produce the tools that artists need, without the performance slowdowns associated with conventional scripting or the programming burden of writing code.

Performance also plays a key role in this release. The extensibility of XSI and ICE is enabled through substantial multi-threading, achieving near-linear scalability as additional processor cores become available. Softimage has embraced a product roadmap that hews to a multi-core processor philosophy, banking on the flexibility of multiple cores for graphics handling and leaving the more structured demands of graphics processing, such as real-time shaders, to the graphics processing unit (GPU). The open-architecture application framework that ICE represents promises to spur creative 3-D development in a way that no off-the-shelf 3-D application has yet achieved.
Before and After: A New Model for Customization

In the past, film, broadcast, and game companies typically developed their own toolsets, specific to their pipelines, in order to realize their creative intent for the content. The highly specialized workflows they used to translate their visions into entertaining, captivating digital content required significant customization to achieve the intended results. Rather than being used simply as a product, 3-D animation software served as a platform, which could then be modified and extended to meet specialized needs or to obtain a particular effect.

Developing plug-ins or accessory scripts to this end, however, was an arduous, time-consuming process, requiring much expertise, trial and error, and development effort. Artists engaged in content creation typically had to make requests to the development team, wait a number of days or weeks for the custom code to be produced, and then often deal with the frustration of something that was not exactly what they had in mind. Whenever the development team could not respond quickly to the needs of the artists, the entire production pipeline slowed down.
The research team at Softimage took a careful look at the trends in the industry and set out to create a dramatically different way of creating, modifying, and interacting with 3-D content. From the resulting design overhaul emerged a unique approach to working with 3-D content within an interactive visual environment, which came to be known as ICE.
By exposing fundamental functions of the XSI program code within a consistent visual interface, both artists and technical developers can instigate changes quickly without the risk of breaking anything. This approach has the added benefit of presenting a uniform development methodology used by Softimage itself, which supports tool reusability, encourages third parties to participate in the development
process, and offers a reliable platform to evaluate and execute operations in parallel (a key feature that underlies the high performance and throughput of the application). Custom 3-D tools now can often be created in minutes or hours, rather than days or weeks, and then provided to artists and other staff members to use in the production workflow.
ICE: How It Works

ICE uses a node-based approach to tool development. To create a special-purpose tool, which is called a compound, the developer combines a collection of nodes or sub-compounds within a data-flow graph, then packages it in such a way that it can be handed off directly to an artist. The node-based structure takes advantage of the fact that the compounds are designed for run-time execution. No compiling is required. This technique produces tools that can be created and refined quickly within the ICE visual environment and then packaged for use immediately.

The consistent framework is ideal for developers, whether they are in-house staff members, end-user customers, or third-party custom toolmakers. Reusability is also a plus, as large studios or organizations can take existing tools and perform further modifications or extend the capabilities of third-party special-purpose tools. This flexibility streamlines the production workflow and breaks away from the limitations of relying on C++ code development or scripting languages, both of which are much less direct ways to achieve desired results.

This open-architecture approach, shown in Figure 1, emphasizes the creative aspects of development and minimizes the amount of work necessary to achieve a particular effect (a factor that can be very liberating for development teams).

Figure 1. Open-architecture framework used with ICE (diagram labels: Application Framework, Visual API, Open Data, Standard Scripting and Programming).

Threading is the Key

All the benefits of easy-to-create, custom tools are lost if they don’t perform well in the production workflow. The performance characteristics of ICE were a foremost consideration during development and Softimage gained guidance and direction from the Intel multi-threading playbook, leveraging techniques and tools that have been refined since the early days of hyper-threading and single-core multi-processor systems.

The multi-processor capabilities stem from the manner in which ICE handles its effects graphs. ICE performs two separate operations: first analyzing the graph and then executing. The graph analysis is performed only when modifications are made to the ICE graph. Adding a node, deleting a node, or making a connection all trigger a graph analysis, which is similar in some ways to compiling a program. Whenever output is requested from the ICE graph, the actual execution takes place as a separate thread. Because the analysis and the execution are performed in parallel using individual threads, the application remains very responsive to the user and the performance for complex 3-D scenes is boosted considerably. Even while variables are being changed or the scene environment modified, the execution typically takes place in real time or faster. As additional cores become available, the application design readily uses them to generate additional threads, creating a platform optimally tuned to take advantage of the maximum amount of computing power available.
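To make the analyze-then-execute split concrete, here is a minimal sketch of the general pattern: a cheap analysis pass runs when a graph is edited, and the heavy evaluation runs on its own thread so the interface stays responsive. The names (Graph, analyze, execute) are invented for illustration and are not the Softimage API.

```cpp
// Illustrative sketch only: a graph is "analyzed" when it is edited, and
// "executed" on a worker thread whenever output is requested.
// Hypothetical names; not Softimage code.
#include <atomic>
#include <future>
#include <iostream>
#include <numeric>
#include <vector>

struct Graph {
    std::vector<double> values;        // stand-in for per-node data
    std::atomic<bool>   analyzed{false};
};

// Cheap pass run only when the graph changes (add/delete node, new connection).
void analyze(Graph& g) {
    // ...validate connections, order the nodes for evaluation (omitted)...
    g.analyzed = true;
}

// Heavy pass run whenever output is requested; safe to run on a worker thread.
double execute(const Graph& g) {
    return std::accumulate(g.values.begin(), g.values.end(), 0.0);
}

int main() {
    Graph g;
    g.values.assign(1000000, 0.5);

    analyze(g);                                    // triggered by an edit
    auto result = std::async(std::launch::async,   // execution on its own thread
                             execute, std::cref(g));

    // The interactive thread remains free to accept further edits here...
    std::cout << "graph output: " << result.get() << "\n";
}
```

The same shape extends to more cores: each output request can be evaluated by additional worker threads while the editing thread keeps accepting changes.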
“ICE provides a platform for technical directors who modify and write their own tools, and it gives them a chance to do this in a much easier and much faster way. If you want to create a 3-D match that is going to light a fire, you can visually code all the information within ICE to build that tool. Then the artist can use the tool and determine how it performs. How much flame do I want? How much smoke do I want? How bright do I want the flame? ICE lets you control whatever you want in the easiest way possible.” - Jennifer Goldfinch, Manager of Strategic Relations, Softimage
This inherent parallelism in compute operations results in scaling with ICE that approaches linearity in relation to the processor cores available as the data set size increases. An upgrade from a system featuring four cores to one with eight cores gives the end user the capability to generate effects on data sets double the size, while maintaining a closely equivalent performance level. A pure linear increase is not possible because of the number of operations that are limited to single threads and the overhead of transferring data to the graphics hardware or other components. However, this overhead becomes less a part of the overall performance as more complex 3-D effects are employed. In bench testing performed by Softimage with quad-core systems, it is not unusual to see processor utilization of 90 percent or more spanning four individual cores. The efficiency of this processing distribution is largely due to the elegance of the multi-threaded code in XSI.
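The shape of that scaling can be sanity-checked with Amdahl’s law. The serial fraction below is an assumed, illustrative number, not a figure published by Softimage; it simply shows why the curve approaches, but never reaches, a pure linear increase.

```cpp
// Estimate parallel speedup with Amdahl's law: S(N) = 1 / (s + (1 - s) / N),
// where s is the fraction of the work that stays single-threaded.
#include <cstdio>

double speedup(double serialFraction, int cores) {
    return 1.0 / (serialFraction + (1.0 - serialFraction) / cores);
}

int main() {
    const double s = 0.05;  // assumed 5% serial work (illustrative only)
    std::printf("4 cores: %.2fx   8 cores: %.2fx\n", speedup(s, 4), speedup(s, 8));
    // With s = 0.05 this prints roughly 3.48x and 5.93x: close to linear,
    // but never perfectly linear, matching the behavior described above.
}
```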
The Move Toward Processor-Based Graphics

As a major creator of applications in the realm of digital content creation, Softimage has followed the industry trends closely at the workstation level—one moving toward the GPU; the other, toward the CPU. “This is something that we are watching all the time,” said Bill Roberts, the Director of Product Management. “We experimented with
the GPU, but we made our choice about 18 months ago. We felt that the best return on investment would be in the realm of the CPU. So, for our previous release, we rewrote our entire render back-end to the application. That was GigaCore I core and we made it highly multi-threaded. On that side, we are highly differentiated in that our implementation of mental ray* can render more objects than any other implementation and can render them faster than any other implementation.”

“For the next phase, which you see with ICE,” Roberts continued, “the underlying architecture is GigaCore II core, which optimizes creative tasks with a workflow focus on particles and effects. The GigaCore strategy with this release is midway through its planned four stages; workstations and compute will continue to evolve and Softimage will remain on the bleeding edge. All of this work is grounded in providing content creators a better way to bring their visions to life. At the core of our design is the question: what good is power and speed unless it ultimately buys the content creator a better realized idea? I think we all know that in digital content creation, nobody ever finishes the creative task earlier. They just do a better job of it. Everybody works right up to the deadline. That is what we are all about at Softimage: taking the technology and turning it into a creative function.”
XSI* excels at innovative character creation, as shown in this model by the Glassworks, which appeared in a Bank of Ireland TV ad. Image courtesy of Softimage Co. and Avid Technology Inc.
The intricacies of animating believable characters can be accomplished much more easily using custom tools developed with ICE, as shown in this music video still of BJORK produced by UVPH, based in New York. Image courtesy of Softimage Co. and Avid Technology Inc.
Tool Sharing is Encouraged

Another benefit of the open-architecture model is that the knowledge and expertise gained in the creation of XSI tools can be freely shared, through tools that are exchanged, without the risk of revealing underlying processes or some particularly brilliant bit of math that was used. “One of the things that we have been really focused on for version 7,” Roberts said, “is how our user community will be able to work and help each other. With the compounds that people are creating to do certain effects, often a lot of them will be very proprietary to the studios that use them for a project, so they can’t give all the details. But, they can lock it down and provide the end result to other people and share tools among the larger community. We have created an entire community site (community.softimage.com/). We are hoping and expecting to see lots of sharing of different tools.”

“We are trying to empower the people who are already extending the product,” Roberts continued. “We want to allow more people to inject innovation into XSI by making it easier and building a community around ICE. We suspect there will be some people who put up complete compounds that perform certain functions, but we expect to see more building-block ideas. So you might have a sub-compound that creates a particle cloud with an upward force based on it. Someone could grab that, combine it with a randomized node, combine that with another function, and end up with a simulated fire—just by dragging in some pieces. This is huge because it ties into a strong industry trend: there is an ever-increasing demand for pre-visualization across all industries, and being able to lean on the community early gets your rough ideas documented faster.”

Fire and Ice

Customized effects and animation are now an integral part of the Softimage XSI workflow. The flexible, high-performance framework that ICE offers places unprecedented capabilities into the hands of developers and artists. Combined with the solid performance increases resulting from extensive multi-threading throughout the program, the SOFTIMAGE|XSI package now has particular appeal to studios and independent producers who need the ability to craft custom effects and view the results swiftly. The ease with which developers can design custom tools, coupled with the simplicity with which artists can adopt these tools in their everyday workflow, heralds a new creative milestone in this industry.

Whether the end goal is a fire that burns more brightly, a wave that crashes more furiously, or a wisp of hair that drifts across the heroine’s forehead more realistically, the capabilities of XSI and ICE make the goal easier to achieve.
•
Unlocking the Potential of Graphics Processing: Technology Transfer at Its Finest
BY Chryste Sullivan
How do technology ideas propagate through the world at large and somehow become real? What does it take to capture the imaginations and talents of the students and researchers in universities worldwide and engage them in exploring and using a new graphics architecture? These questions occurred to me recently while thinking about Intel’s new graphics architectures. Years ago, British biologist Lyall Watson postulated the Hundredth Monkey Phenomenon after observing that once a group of monkeys on a remote island learned how to wash their potatoes in water, monkeys on nearby islands were soon observed doing the same thing. Watson speculated that maybe there was a group consciousness that somehow percolated across vast distances among members of the same species. If such forces could be applied to transferring the technological skills involved in graphics processing, Randi Rost would be out of a job.
Randi, an experienced computer graphics professional at Intel, is responsible for getting the word out about new Intel graphics architectures. He finds ways to reach university students, educate and inform researchers, engage ISVs and developers, and, generally, prepare the ecosystem for Intel’s next-generation graphics architecture. My curiosity piqued, I talked with Randi about his thoughts on the potential for visual computing on next-generation graphics architectures and the ways that Intel is helping build software engineering expertise in this area. Randi discussed the many ways he and Intel are trying to make the learning curve less steep.
How long have you been involved in computer graphics?
I discovered my passion for computer technology as a sophomore in high school. In the mid-70s, Minnesota became the first state to have a statewide computing network linking all the universities and secondary schools. I discovered my passion for graphics when I bought myself an Apple* II computer during my first year of college. I went to graduate school to study computer graphics specifically, and I’ve always pursued jobs that have been connected to this creative and fascinating field. I’ve worked at startups and at big companies like DEC and HP, always looking for the spots where interesting work was being done in computer graphics.
How did you get started with Intel?
The day after our high-end graphics development team at 3D Labs was laid off, Intel arrived to discuss an upcoming project involving major advances in graphics processing. It didn’t take very long to see that Intel was doing something that was going to really change the industry. I’m not just saying that because it’s a tagline, but given my background in the computer graphics industry, it was clear that Intel had a very compelling story with this new graphics architecture.
What kind of work is your group engaged in at Intel?
Our group is the primary development team for software development tools being created for Intel’s newest graphics architecture. We work very closely with the Intel teams that provide the drivers and hardware. We also work very closely with other groups within Intel, such as the Intel® University Program and the Intel® Software College.
What are you doing to bring universities on board with new Intel technologies?
I interact with ISVs and university collaborators, collecting feedback on the new software development tools for many-core graphics processors, providing training, and building industry momentum for Intel’s visual computing efforts. Because many computer systems departments dropped their multi-core focus a few years back—or ignored it altogether—there’s a big push to encourage the development of a multi-core motif today. And graphics programmers can no longer avoid familiarity with new architectures and multi-threaded applications if they expect to maximize their own area of expertise in the future. Hence, university support has grown more important. We felt that it was vital to get the message out to key visual computing researchers and leaders in academia who could put breakthrough visual computing technologies to use. So, Intel launched a program with a few top-flight research institutions. We make sure we have the right schools and that they’re getting the right support. In some cases, our support includes providing grants to universities that
are leaders in the graphics research space. We educate them on the new graphics architecture and then give them the grants so that they can begin targeting their research efforts in that direction.
What is the value in getting universities involved early on?
Universities are where a lot of new technology gets dreamed up, where new algorithms get invented. Universities, particularly in the visual computing space, have been relatively shackled by the existing graphics hardware capabilities—where the entire rendering pipeline has been built into fixed-functionality silicon. This provides scant flexibility for researchers to innovate in terms of rendering algorithms. Recently, within the last half-dozen years, the hardware pipeline has gotten to be more programmable, but there are still a lot of constraints.
With our upcoming graphics architecture, built around a completely general-purpose CPU-based design, we’re basically removing all the constraints for the rendering pipeline. We’re telling researchers: “Hey, here’s an architecture where you can effectively do everything you want in software. There’s no fixed functionality to get in your way. If you want to experiment with new rendering algorithms, with ray tracing, with hybrid rendering systems, if you want to replace the rasterization unit, if you want to have procedural geometry so that you can render spheres analytically (rather than breaking them down into polygons)—all of those things are possible.” It’s a completely open, general, high-performance platform for highly parallel floating-point workloads, such as graphics. And, the basic programming model is simple: C++ code that targets x86 cores.
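As a small illustration of the kind of freedom Rost describes, the sketch below intersects a ray with a sphere analytically, from its equation, rather than tessellating it into polygons. It is plain C++ targeting x86, written for this article; it is not sample code from any Intel SDK.

```cpp
// Analytic ray-sphere intersection: the sphere is resolved from its equation,
// with no triangles involved. Generic C++; not tied to any graphics API.
#include <cmath>
#include <cstdio>
#include <optional>

struct Vec3 { double x, y, z; };

double dot(const Vec3& a, const Vec3& b) { return a.x*b.x + a.y*b.y + a.z*b.z; }
Vec3 sub(const Vec3& a, const Vec3& b)   { return {a.x-b.x, a.y-b.y, a.z-b.z}; }

// Returns the distance along the ray to the nearest hit, if any.
std::optional<double> hitSphere(Vec3 origin, Vec3 dir, Vec3 center, double radius) {
    Vec3   oc   = sub(origin, center);
    double a    = dot(dir, dir);
    double b    = 2.0 * dot(oc, dir);
    double c    = dot(oc, oc) - radius * radius;
    double disc = b * b - 4.0 * a * c;
    if (disc < 0.0) return std::nullopt;            // ray misses the sphere
    double t = (-b - std::sqrt(disc)) / (2.0 * a);  // nearest root
    return (t > 0.0) ? std::optional<double>(t) : std::nullopt;
}

int main() {
    // A ray shot down the z-axis toward a unit sphere centered at z = 5.
    if (auto t = hitSphere({0,0,0}, {0,0,1}, {0,0,5}, 1.0))
        std::printf("hit at t = %.2f\n", *t);       // prints 4.00
}
```

On a fully programmable, CPU-based pipeline, a routine like this can replace the polygonal representation entirely, which is exactly the sort of experiment a fixed-function pipeline makes difficult.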
Have you been able to attract any high-level talent with this initiative?
Yes. We’ve been able to attract interest from some of the world’s top visual computing researchers. With our targeted grant program, we’ve been able to get the attention of folks like Pat Hanrahan of Stanford. He is widely considered one of the top graphics researchers worldwide. This year we have a grant that we’ve provided to another top researcher, Fabio Pellacini of Dartmouth. He gets one or two of his papers published at SIGGRAPH every year. Collectively, the research talent that is already working on innovating with our new graphics architecture is mind-boggling. We also get more than time and mindshare as university researchers look at what our next-generation graphics architectures can do.
A number of graduate students gain a great amount of exposure to our new graphics architecture, including the tools and APIs that we’re providing. Many of these individuals will become valuable contributors to the industry and, hopefully, some of them will be attracted to Intel and come to work for us when they graduate. We also intend to have a huge impact on computer science curriculum as it is taught in universities worldwide.
How can you teach developers and students about new graphics architectures before hardware is available?
When Intel is developing a new graphics architecture, we also create a pre-silicon development environment that allows us to develop code that runs on a simulated version of that new architecture. The future of graphics computing relies on a roadmap of ever-expanding processor cores and on software’s ability to capitalize on the massive performance benefits. But if we expect the next generation of graphics developers to hit the workforce ready to rock, somebody had better think beyond loosely slipping a couple of courses in parallel programming into the computer science curriculum.
One of the biggest challenges for developers is to devise ways to turn their graphics algorithms into code that runs well on massively parallel systems. We haven’t had a lot of history training people to think in parallel terms. As humans, we have traditionally serial thinking patterns as we move through time and life; we’re on a single path of execution. However, with many-core systems, the key is to think of ways to break problems into separate threads that can take advantage of the many cores in the system. Thinking in parallel is vital; we’ve invested a lot of effort at Intel into getting this new way of thinking quickly developed. Next-generation algorithm developers will need to understand and overcome a fundamental challenge: how to break down their problem into something that can be spread across a multitude of cores for maximum performance. You don’t want the students and researchers to just learn a bit of coding. You want them to think—and to think in parallel. There are tremendous opportunities for this kind of work: combining parallelism and rendering algorithms in new ways.
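A tiny example of the decomposition Rost describes: one large loop split into per-thread chunks, one chunk per available hardware core. This is a generic C++ sketch written for illustration, not Intel tooling or course material.

```cpp
// Break one big job (brightening an image buffer) into disjoint per-thread
// chunks, one per available hardware core. Generic C++11; no vendor libraries.
#include <algorithm>
#include <cstdio>
#include <thread>
#include <vector>

int main() {
    std::vector<float> pixels(1 << 20, 0.25f);                  // the shared data
    unsigned cores = std::max(1u, std::thread::hardware_concurrency());

    std::vector<std::thread> workers;
    std::size_t chunk = pixels.size() / cores;
    for (unsigned i = 0; i < cores; ++i) {
        std::size_t begin = i * chunk;
        std::size_t end   = (i + 1 == cores) ? pixels.size() : begin + chunk;
        // Each thread owns a disjoint slice, so no locking is needed.
        workers.emplace_back([&pixels, begin, end] {
            for (std::size_t p = begin; p < end; ++p) pixels[p] *= 2.0f;
        });
    }
    for (auto& w : workers) w.join();

    std::printf("done on %u threads, pixel[0] = %.2f\n", cores, pixels[0]);
}
```

The harder, more interesting work begins when the problem does not split this cleanly, which is where the parallel-thinking curriculum Rost describes comes in.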
How can visual computing researchers get involved with the program?
We are working on plans to expand our graphics-architecture research program. You can tell us about your interest and your current research directions by sending an e-mail to us at lrb.sdk@intel.com. And when our new graphics architecture is launched, we are hoping to have them in university research environments everywhere. Our goal is to make the new Intel graphics architecture the innovation platform of choice for visual computing researchers.
Liberating Developers and Graphics Innovation

The conversation with Randi made one thing clear: developers will soon have a wide-open path to imagine, design, and create breathtaking works in the digital content creation realm without having to invest time or money in new development tool sets. At the end of our talk, Randi summed it up nicely, “You have the processing power and it’s completely flexible and completely general—you can innovate to your heart’s content.”
•
“ You have the processing power and it’s completely flexible and completely general—you can innovate to your heart’s content.” — Randi Rost, External Relations Manager, Graphics, Intel Corporation
about the author: Randi Rost Randi Rost is a seasoned veteran in the field of high-performance graphics hardware and software products, with over 25 years of pioneering development experience. He is also the author of several books, including OpenGL® Shading Language. In his current role at Intel as External Relations Manager, Graphics, Rost has been deeply involved in the design and launch of the graphics architecture, opening channels to educate and inform researchers, ISVs, developers, academics, and students.
Rethinking Programming Abstractions for Graphics and Simulation
Graduate students Kayvon Fatahalian and Jeremy Sugerman are engaged in a research project sponsored by Intel at the Stanford Graphics Lab, investigating new ways to enable advanced real-time graphics on next-generation many-core processor and GPU architectures. Their work involves rethinking programming abstractions for graphics and simulations to strike a balance between providing increased flexibility to applications and maintaining desirable properties of present-day graphics pipelines, such as high absolute performance and code portability.

About their research, Fatahalian said, “In the past year, we have developed GRAMPS, a programming model for many-core throughput programming that places emphasis on the needs of visual computing. In contrast to many prior models, GRAMPS explicitly targets heterogeneity in hardware (that is, fat cores—typical of traditional CPU designs—as well as thin cores, providing maximal arithmetic capability and fixed function units). GRAMPS targets workloads, such as advanced rendering, that are irregular and data-dependent. In pursuit of these goals, GRAMPS requires applications to create software-constructed computation pipelines and graphs whose stages are defined by conventional threads, data-parallel shaders, or dedicated fixed-function logic. Stages are connected asynchronously by means of queues that GRAMPS uses to capture producer-consumer parallelism and schedule work efficiently in large, coherent blocks—despite inherent irregularity in application workloads.”

Early development using GRAMPS has yielded three prototype renderers: a simplified Direct3D* pipeline, a packetized ray tracer, and a hybrid combination of both techniques. Preliminary research has studied the performance of these renderers on two simulated platforms: a symmetric many-core processor configuration and a heterogeneous configuration anticipating the fusion of both thin and fat processing cores. Fatahalian and Sugerman hope these simulated evaluations of GRAMPS serve to indicate the plausibility of key GRAMPS ideas and to positively influence future industry architecture and system designs.

In parallel with GRAMPS, research at Stanford continues to advance the OpenGL* and Direct3D* real-time graphics pipeline. Although future throughput computing architectures will most likely give developers the flexibility to implement their own rendering pipelines from scratch, Fatahalian and Sugerman believe there will continue to be significant value in providing applications with a standard graphics-specific programming abstraction and high-performance implementation.

“It is likely that future rendering systems will not rely entirely on traditional rasterization, or entirely on ray tracing, but will employ a combination of image synthesis techniques,” Sugerman said. “We are currently developing a hybrid rendering pipeline that pushes past the capabilities of today’s OpenGL and Direct3D pipeline to support REYES rendering and ray-traced effects. The goal is to understand how these off-line techniques are best implemented and, perhaps more importantly, composed in a real-time context.”
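The queue-connected-stages idea Fatahalian describes above can be illustrated with a generic producer-consumer sketch: two stages linked by a thread-safe work queue. This is not the GRAMPS API; it only mirrors the shape of the pattern using standard C++.

```cpp
// Two pipeline "stages" connected by a thread-safe queue: one thread produces
// work items, another consumes them. A generic illustration of the
// producer-consumer pattern, not GRAMPS itself.
#include <condition_variable>
#include <cstdio>
#include <mutex>
#include <queue>
#include <thread>

std::queue<int>         work;
std::mutex              m;
std::condition_variable cv;
bool                    done = false;

void producerStage() {                       // e.g., generates ray packets
    for (int i = 0; i < 8; ++i) {
        { std::lock_guard<std::mutex> lk(m); work.push(i); }
        cv.notify_one();
    }
    { std::lock_guard<std::mutex> lk(m); done = true; }
    cv.notify_one();
}

void consumerStage() {                       // e.g., shades each packet
    for (;;) {
        std::unique_lock<std::mutex> lk(m);
        cv.wait(lk, [] { return !work.empty() || done; });
        if (work.empty()) return;            // producer finished, queue drained
        int item = work.front(); work.pop();
        lk.unlock();
        std::printf("shaded item %d\n", item);
    }
}

int main() {
    std::thread a(producerStage), b(consumerStage);
    a.join(); b.join();
}
```

A scheduler such as the one GRAMPS describes would go further, batching queued items into large, coherent blocks before dispatching them to fat or thin cores.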
“We are currently developing a hybrid rendering pipeline that pushes past the capabilities of today’s OpenGL* and Direct3D* pipeline to support REYES rendering and ray-traced effects.” — Jeremy Sugerman
Epic’s Unreal Engine* Stops Playing Around: Non-Game Uses Open New Opportunities

Why create games when you can build stadiums, train medical technicians, and make children giggle?

When is a game engine not a game engine? The answer is less a response to a riddle and more a preview of possible future applications of 3-D game engines, as innovative companies and individuals extend the capabilities of Epic Games’ popular Unreal Engine* 3 to non-game uses. One striking example is the 3-D walk-through of the Dallas Cowboys’ stadium, masterminded by the architectural firm HKS, which allows potential clients to navigate through 3-D space and visualize the end result of the project. Visualization of this sort also factors into training applications, as a training program created by Virtual Heroes for emergency medical technicians illustrates. Even the entertainment industry is finding the lure of game engine technology irresistible. Nickelodeon’s animated series, LazyTown, was developed around Unreal Engine 3. Visualization, training, and entertainment are among the many potential uses of the game engine technology created by Epic Games, and the underlying platform based on Intel® Core™ microarchitecture enables the primary capabilities of these applications.
Intricate modeling was required to capture the interior details of the Dallas Cowboys’ stadium.
What’s In a Name?

The term game engine refers more to its typical uses than its actual capabilities. Unreal Engine 3, the latest manifestation of Epic’s prowess, includes built-in capabilities with applications that go beyond the expected. As the 3-D rendering and modeling engine that drives Unreal Tournament* 3, as well as a lengthy list of Triple-A video games, Unreal Engine 3 has earned a place of respect on gaming consoles and PC platforms where alien worlds unfold, battles rage continuously, and the mysteries of unexplored terrain lure gamers into rapt 3-D immersion.

Of the many strengths native to Unreal Engine 3 design, one of its strongest attributes is the multi-threaded nature of the code. Working in close collaboration with Intel, Epic built parallelism deeply into the engine core. As a result, Unreal Engine 3 exhibits the scalability and performance perks characteristic of well-tuned multi-core processor platforms and is well positioned to take advantage of the seismic shift from dual- and quad-core processors to eight-core and beyond.
World’s Largest NFL Stadium Walkabout

When Dallas-based architectural firm HKS started out designing a new home for the Dallas Cowboys’ football team, it began exploring ideas for helping its clients and stakeholders get a clear sense of the ultimate project well before the first groundbreaking ceremony was held. With considerable in-house expertise in 3-D modeling and rendering applications, the firm selected Unreal Engine 3 as the tool-of-choice to create an intricately detailed walk-through model of the stadium, providing viewers the freedom to explore and get a sense of interior look-and-feel and layout.
HKS is no stranger to Unreal Engine technology. Some of the forward-looking staff members have been exploring the possibilities of various 3-D real-time rendering options for years. As far back as 1998, Dave Chauviere, principal and CIO, and Pat Carmichael, HKS’s Manager of Advanced Technologies, had been experimenting with the Unreal Engine. An early project recreating the HKS headquarters building as a 3-D model convinced them that the speed and scalability of Epic’s engine were ideal for their needs. Since then, staff members have produced a number of custom tools to work with the engine and HKS plans to have all of their design presenters using the Unreal Engine to drive displays during pitch meetings.

“Architects want every detail in the building to be accurately precise, as close to the real environment as possible,” said Carmichael. “There are thousands of surfaces in a typical building that we do, especially with the scale of buildings that HKS does. We have to have a lot of different texture surfaces simulated in these environments, and it’s a lot of work.”

The strength of this form of visualization is that it circumvents a common problem in communicating with clients: getting them to bridge the difference between a 2-D image or architectural plan and the final, real-world building, where lighting, textures, colors, and spatial dimensions define the character of the interior space. HKS has used the Unreal Engine successfully in a number of other visualization projects, including the Indianapolis Colts’ stadium (opening in 2009) and the Liverpool, England Soccer Stadium (still on the design boards).

“Unreal technology has been instrumental in selling the Cowboys, Colts, and Liverpool projects,” said Bryan Trubey, Principal Director of the HKS Sports & Entertainment Group. “Epic’s technology allowed us to develop animations and walkthroughs that brought these structures to life in the presentations.”
Cost-Effective Television CGI

The confluence of computer and broadcasting technology has made it possible to merge and blend media in previously impossible ways and has also helped produce some extremely creative hybrid media works. One such work is LazyTown, a children’s television show originated by an Icelandic athlete, Magnus Scheving, and produced by Dutch-based animation innovator, Raymond P. Le Gué. Scheving set out to create a show that encouraged more active, healthier lifestyles among children, and Le Gué helped bring this idea to reality with a blend of real- and computer-assisted techniques, ranging from puppetry and live acting to animation and computer-generated special effects. Le Gué also directs some of the LazyTown episodes and has earned respect for advancing a number of techniques that are helping to simplify television production and bring down costs, including improvements to 3-D computer animation, virtual television, and cinematographic production. For over a decade, Le Gué devoted his expertise to developing pipelines for real-time virtual television production, and real-time motion capture and animation. This work resulted in an architectural framework Le Gué calls XRGen4, which is where he takes advantage of Unreal Engine 3 as the image generator.

“UE3 is an important part of our system,” said Le Gué. “It is the tool with which we create and render our virtual sets. Until now, we have been using our own dedicated rendering engines, but when we saw the performance of Epic’s game engine coming closer and closer to our needs, we decided this is something we needed to investigate.”

In the approach Le Gué has adopted, activities on an actual stage can be merged with the virtualized elements of LazyTown, which includes a 3-D-rendered mock-up of the town itself. Using a green screen for chroma key effects, the live actors are placed in front of the computer-generated virtual background. This successful, award-winning show reaches an international audience (currently comprising 103 countries) and has garnered three Emmy nominations. In 2006 LazyTown was named Best International Children’s Show by the awards committee of BAFTA in the United Kingdom.

(LEFT) The director’s view of LazyTown shows the virtual backgrounds merged with the live and animated foregrounds.

(RIGHT) Navigating a city in Zero Hour is much like moving around within a game environment.

Getting Serious about Games

A class of video games dubbed serious games focuses on advanced learning technology, essentially using 3-D scenarios to stage various training venues and educational exercises. All of the 3-D realism, character interaction, and artificial intelligence features we’ve grown accustomed to in modern games are recast in a framework where the objective is to convey real-world experience in a simulated environment. As you might expect, these kinds of serious games have serious applications, such as training medical personnel how to react to catastrophic situations. And once again, Unreal Engine 3 shines in these kinds of applications. Based in Research Triangle Park, North Carolina, the digital content creators at Virtual Heroes used Unreal Engine 3 in their Emergency Medical Services (EMS) training game, Zero Hour: America’s Medic, commissioned by the George Washington University Homeland Security Policy Institute. Jerry Heneghan, Virtual Heroes’ founder, gained expertise in Unreal Engine technology in previous projects, such as the America’s Army* online game, and saw the wisdom in capitalizing on the latest engine capabilities.

“Zero Hour is intended to fully take advantage of all the bells and whistles of Unreal Engine 3 while creating a uniquely suspenseful, immersive, virtual world for training real medics,” explained Heneghan. “UE3 features that we took advantage of included volumetric environmental effects, on-the-fly real-time shaders, pre-computed shadow masks, directional light maps, particle physics, and environmental effects.”
Emergency medical technicians can train in a dynamic environment where they confront a variety of challenges and changing conditions.
Greg Lord, associate director of the National EMS Preparedness Initiative, who originated the project with Virtual Heroes, sees future applications that go well beyond the version of Zero Hour that will be available for download this summer (www.nemspi.org). He foresees a framework within which training for first responders—including American Red Cross personnel, firefighters, police departments, and emergency management staff members—could be presented using the same serious game approach.

“Think of what you could do if you could create what amounts to a Second Life* for disaster response,” Lord speculated. “Providers across all the disciplines of a large-scale event could operate in real time and on an ongoing basis. We could design a program for the city of New York,
and they could run their own virtual drills. The same could be done for San Francisco and Los Angeles.” In the meantime, developers looking for a potentially valuable niche market would do well to explore the training possibilities of Unreal Engine 3.
Short-Form Animation Flourishes: Chadam

A new Web series, Chadam, being distributed by Warner Bros. Television, relies on Unreal Engine 3 for producing animations that bring Hollywood-style values to an Internet-scaled show. Jace Hall, whose career includes a stint as head of Warner Bros. Interactive Entertainment, founded HDFilms as a vehicle for productions of this kind. In his work then and now, he pays careful attention to the capabilities and utility of game engines on the market. Hall chose Unreal Engine 3 for use in Chadam after becoming familiar with its CGI visual capabilities that he feels are better than the original Pixar Toy Story animated film.
Non-Commercial Licenses Boost Platform Value

The combined technologies of Unreal Engine 3 and the platforms based on Intel® Core™ microarchitecture have extended value when applied to applications outside the gaming realm. Non-commercial licenses available from Epic Games support a variety of application possibilities in education, training, and other areas, and can generally be obtained easily under a liberal, usage-oriented licensing policy.
The use of Unreal Engine 3 also supported a model of fast and fluid storytelling, which was essential to Hall’s intentions. “Epic tries to make the technology as easy for other people to use as can be for as complex a system as it is,” said Hall. “The difference between what Unreal is today versus what it was back in the day is that it’s a much more elaborate toolset than it ever was before. They’ve created the kind of user interface that a film or TV production would need to almost be used real time in these situations. Using UE3 is about delivering high-quality, 3-D animated stuff under very explicit cost controls. You make trade-offs in terms of what you can do and how much it’s going to cost. What they’ve done with the toolset by adding tools like Matinee*, which is literally a cinematic sequencing tool that interfaces with the engine, is make cost-effective 3-D possible.”

The initial run of the series will be 10 episodes, each five minutes long. Hall has slated one year for the entire production run, a comparatively short timetable for a project of this magnitude. In the series, the lead character, Chadam, a creation of artist Alex Pardee, tries to stop a serial killer, Viceroy, from destroying the world.

“Our only limit is our budget, but even with this budget, we’ll be able to tell a much more compelling story with Unreal than we could at the same budget with live action movie cameras,” said Hall. “We will set a new bar for short-form 3-D animation for the Internet with Chadam.”

Jace Hall, the founder of Monolith Productions, has created numerous hit games using a variety of 3-D technologies. His choice of Unreal Engine 3 to animate the characters in Chadam’s world is a strong vote of confidence for Epic’s technology.
Playing the Multi-Core Card

Tim Sweeney, founder of Epic Games, sees strong potential in the use of multi-core processors to carry out individual graphics-related tasks, providing more than just increased performance. Parallel operations can be used to strengthen artificial intelligence, perform physics calculations, and carry out programmable special-purpose functions that can improve game realism and responsiveness. At the 2008 Game Developers Conference, in an interview with TG Daily, Sweeney said, “It would be great to be able to write code for one massively multi-core device that does both general and graphics computation in the system. One programming language, one set of tools, one development environment—just one paradigm for the whole thing: large-scale multi-core computing.”

Sweeney continued, “That time will be interesting for graphics, as well. At that time, we will have a physics engine that runs on a computing device, and we will have a software renderer that will be able to do far more features than you can do in DirectX* as a result of having general computation functionality. I think that will really change the world. That can happen as soon as the next console transition begins, and it brings a lot of economic benefits there, especially if you look at the world of consoles or the world of handhelds. You have one non-commodity computing chip; it is hooked up directly to memory. We have an opportunity to economize the system and provide entirely new levels of computing performance and capabilities.”
•
Featured Artist:
José María Andrés: Genetic Genius

Creative mind José María Andrés unleashes his CG art—and his genius—on the world.
Some people are just born with it—that genetic code of creativity that forces its way out of every pore until one day, something magical happens. In this case, the magic came in the form of CG artist José María Andrés receiving his very first PC. The year was 1994, and a 16-year-old José was bursting with creativity, not unlike that of a young Pablo Picasso—with whom José shares no less than two names.1 An avid and motivated artist, José broke into CG arts by spending uninterrupted hours—many of them—in front of his PC. “I’ve been interested in art since I was a kid and always loved gadgets and computers,” José explained. “So it made sense that I mixed it up and finally got into CG arts.”
Largely self-taught and just four short years after he got his first PC, José attended Salamanca University in Spain where he got his Bachelor Degree in Fine Arts. He also achieved “Master in Animation” in the Spanish Trazos School of Arts. But José’s motivation wasn’t driven necessarily by that of school. Of teachers, mentoring, and education, José had this to say, “I would have liked to have had a mentor, but 90 percent of my skill comes from me—from multiple hours creating and destroying scenes. I had amazing teachers in musical formation, piano, and in Cinematic Arts—but when it came to CG I have to say, I did it all myself.”

So maybe it’s not just genetic genius, but also an artist’s extreme drive to succeed.
1. Pablo Picasso’s full legal name is: Pablo Diego José Francisco de Paula Juan Nepomuceno María de los Remedios Cipriano de la Santísima Trinidad Martyr Patricio Clito Ruiz y Picasso.
Inspired Life

José has worked on some amazing projects, many of which helped open doors for him. “I am a funny, happy guy who leads a very happy life,” explained José. “And that’s what inspires me every single day. I love to create scenes where images tell a funny short story.” Like a colorful piece titled Fireflies: A Light Digestion that José created and published in 2006 about a chameleon that ate a firefly—it landed him his first job in the UK. In his second chameleon piece, he did just as well. “In four days I took my chameleon and created another story about him choosing to eat the wrong fly and well . . . we all saw the consequences (laughs). I called it Choose Your Captures and I got two awards and a write-up in two publications for it.”

José also worked on a project with RealtimeUK called Buzz Junior Monsters and recently worked with Nexus Productions on a new advertisement for Coca Cola. “These have been the brightest highlights of my career,” said José, as they stretched him personally and artistically. On his recent work with Coca Cola, José continued, “It was my first job in London and an amazing opportunity to meet brilliant artists and to work on a very big project. It was difficult because I had to texture a lot of characters with very fine detail and under a very tight deadline.”

Currently freelancing, José doesn’t have an official artist’s studio, “I’ve been happily working at Nexus Productions—it looks like a messy place full of brilliant people and ideas,” he explained. “And you can smell the studio feeling around every corner.” Feeling that José is pursuing every kid’s ideal dream of never growing up, I had to ask—what does a CG artist like
José do for fun other than creating fun for other people? He had this to say, “I love to go to the cinema, play piano, and feed the squirrels at St. James’s Park.2 Another big hobby of mine is singing a cappella. I would love to sing in a group, but it is very difficult to find people and time so I decided to sing alone, recording different tracks and then mixing them to create my songs.3 Nothing professional, but very funny!”
Artist’s Arsenal

Instrumental to his work as a rockin’ and celebrated CG artist—not to mention satirical singer—is the very best hardware José can get his hands on—oh, and he likes to build his systems himself. “I’m using a workstation I built three and a half years ago,” José discussed. “It’s a 64-bit dual-core Intel® Xeon® processor clocked at 3.4 GHz with 4 GB of RAM and an ATI FireGL* 7100 graphics card. Now I’m going to buy the components to build another machine—only this time it’ll be a powerful gaming PC rather than a workstation. I’m getting an Intel® Core™2 Quad processor clocked at 2.83 GHz with 8 GB of RAM, and an NVIDIA GeForce* 280 graphics card. Why am I moving from a workstation to a gamers’ PC? I use Autodesk 3ds Max* to create my work, which is optimized to get the greatest performance using DirectX*. Gamers’ graphics cards are highly optimized for this technology and also, if you want to play you can (laughs).”

On the software he uses, José was adamant, “My entire life I’ve been a passionate CG lover. I’ve learned and tried almost everything, but in the end, I chose Autodesk 3ds Max, V-Ray*, Adobe Photoshop*, and VMWare Fusion* as my personal pipeline. My official stance? I cannot live without them.”

But recently José has dabbled in new-to-him software that is quickly rising to the “can’t live without” list. “Two tools I consider very important for my art are ZBrush* and Adobe Photoshop plug-in Filter Forge*. A lot of people know about ZBrush, but Filter Forge is still a bit of an unknown. With it you can create your own filters and generate seamless textures, extract the maps for 3-D packages—like normal, bump, and diffuse—and it really is all you need when you want to produce the best quality in your textures.”

Chasing Burton

Like many CG artists, José has particular ones that he looks up to. “I absolutely adore Tim Burton’s work,” discussed José. “He is an amazing story teller and an outstanding artist who has created a new and very well defined style. Walt Disney is also an idol for me. He was a visionary and has been a great reference for kids and adults since the ‘30s.” And there’s no surprise there. Both Burton and Disney were masters at not growing up while insisting that the world not grow up right along with them. Although he doesn’t see CG art in the museums of our future alongside Picasso and Renoir, José does defend CG as a true art.
2. To see where José feeds the squirrels, visit www.royalparks.org.uk/parks/st_james_park/

3. To listen (and laugh) along with José’s songs, visit his Web site at www.alzhem.com/sub1/music.htm
“My art can be saved and shared with everyone through the Internet,” said José. “But there is only one Picasso Guernica4 hanging in a museum. Only one to share.” Don’t get the wrong idea though. José is also a painter whose work could very well hang right alongside the man with whom he shares two names.
Dreaming Big

As to where he sees himself in 10 years, José had this to say, “Five years ago I saw myself in the US, but due to my lack of experience it was quite difficult to get a job. Three years ago, I saw myself working hard in my own studio. Now we’ve just moved to London, and honestly, I have no idea where I’ll be in 10 years. I am open to whatever the future brings me. But I’d love to work with Tim Burton and to work at Pixar. Everyone dreams the same I guess.”

So what advice does José have for CG artists just getting started? “If you can, look for a very good school with internship possibilities,” advised José. “If school is not an option for you, don’t give up. The Internet and Google are your best friends. Look for forums and tutorials. Ask what you don’t know and make a lot of “CG friends.” If you are good as an artist and also as a person you’ll find your first job.” José also intends to give back, “I’d like to thank the CG community for all I’ve learned from them. And I hope I can contribute by teaching people with my tutorials.”

I have a feeling that CG artists of the future will be chasing José María Andrés. As for Pixar and Tim Burton—look out boys and girls, José’s coming for you—and he’ll be singing and laughing the whole way.
•

4. Guernica by Pablo Picasso can be viewed in the Museo Nacional Centro de Arte Reina Sofía in Madrid, Spain.
About the Author: Sylvia Flores Self-professed gadget girl and writer Sylvia Flores has been marketing fun electronics and high tech for the last 12 years. In her spare time she writes, flies airplanes, flings paint, and tries to invent new products that she can patent to make millions. So far, no takers.
Don’t wait for a blue moon . . . to reveal your vision. Scratch that creative itch and get your visual adrenaline glowing today. Let Intel help spark your innovation in visual computing. Subscribe to Intel® Software Dispatch.
www.intelsoftwaregraphics.com
Image created and provided by José Maria Andrés– http://www.alzhem.com. José rocks the CG arts with a custom-built computer based on two Intel® Xeon® processors clocked at 3.4 GHz, 2 MB of L2 cache and an 800 MHz Front Side Bus. Modeled in Autodesk 3ds Max* | Rendered using V-Ray*.
© 2008, Intel Corporation. All rights reserved. Intel and the Intel logo are trademarks of Intel Corporation in the U.S. and other countries. * Other names and brands are the property of their respective owners.
It’s Back, Badder than Ever: Bionic Commando*
BY Lee Purcell
Stretching the Boundaries of Extreme

From the retro archives of gaming history, a popular Nintendo title, Bionic Commando*, has been resurrected in richly detailed, 3-D glory by GRIN, a Swedish development firm, and showcased on a multi-core gaming machine monster, Skulltrail. More formally known as the Intel® Dual Socket Extreme Desktop platform, Skulltrail is an eight-core system featuring two four-core Intel® Core™2 Extreme processors. This is the configuration that showcased the pre-release version of Bionic Commando at the 2008 Game Developers Conference (GDC) to an
enthusiastic audience. The extra processing muscle delivers greater detail, better physics, and exceptional effects, or as Intel applications engineer Orion Granatir noted, “Basically, it allows them to turn the dial all the way up on the game.” Turning up the dial—always a worthy endeavor—is something best accomplished through a well-calibrated blend of hardware and software. Parallelization was integral to the game development. “The nice thing about this title,” Granatir said, “is that it takes advantage of multiple cores, so as we start getting these machines out there with more and more
cores, eight cores, we’re actually seeing titles that take advantage of it. Intel can help developers on multi-core in a number of ways. One of our main goals, through online resources, training sessions, and university courses, is to show developers how they can do multi-core, how they can design multi-core techniques into their games. There were a lot of titles early on where they were just trying to patch in multi-core with Intel, just saying, this is the future. Now we’re doing sessions with companies, we’re teaching them, we’re doing literature, showing them how they can start taking advantage of our multi-core from the ground up.”
Messing with a Classic
Bionic Commando, the Nintendo classic released in 1988, equipped the hero with a useful grappling hook for swinging through the 2-D, side-scrolling game space. Though the latest iteration, published by Capcom, takes place in a far more interesting 3-D environment, the grappling hook is back with additional capabilities. Not only is the mechanical appendage handy for swinging across menacing abysses, but the bionically modified agent, Nathan Spencer, can use it to toss heavy objects around, traverse vertical inclines, and dissuade opponents from getting too close. Agent Spencer has an attitude in this new release. Having been framed for a number of crimes, he has been sentenced to execution when a mammoth explosion turns the city into rubble and Spencer is freed by his captors to help out the government in the post-apocalyptic world. With the terrain in ruins and the city's air defense grid now in the control of a massive terrorist force whose goal remains unclear, the FSA have only one option left: a behind-the-lines assault—the perfect job for a Bionic Commando.

GRIN kept enough of the flavor of the game to make it recognizable to those who may have played the original, but the current 3-D renderings are a vast improvement over the earlier 2-D artwork. It's a richly detailed world of towering buildings, suspended roadways and monorails, deep canyons, and sheer rock faces, where every environment is scalable using swinging, scaling, climbing, and wall-walking techniques. Moving all of these pixels, textures, models, and shadows around a complicated, rubble-strewn world takes massive amounts of processing power. The game certainly doesn't require an eight-core system for a satisfying experience, but it knows how to use those cores when available, and the result is a visual delight to game aficionados.

David Potages, a Senior Engine Architect at GRIN, was heavily involved in threading Bionic Commando. "I have been at GRIN for three years," Potages said. "I'm working with the engine, improving the way it works. Threading the renderer is the first thing to do and it takes some time. Because it is quite tricky, it is really important to start very early. With eight cores we're getting better physics and effects, as you can see, like explosions, like effects with particles. I used Intel® VTune™ Performance Analyzer. You get pretty good interaction with your engine, as well." GRIN was also working with Intel on GRAW2, taking advantage of multi-core to enhance multi-player interactivity and responsiveness.

"WHEN IT COMES TO DELIVERING INNOVATION TO THE ULTIMATE ENTHUSIAST, OUR NEW 8-CORE DESKTOP PLATFORM IS A WINNER. THE GROUNDBREAKING INTEL® DESKTOP BOARD D5400XS ENABLES THE FLEXIBILITY TO PAIR A VARIETY OF QUAD GRAPHICS SOLUTIONS WITH TWO OF OUR FASTEST DESKTOP PROCESSORS. THE RESULT IS STUNNING PC PERFORMANCE."
— Jeff McCrea, Senior Vice President and General Manager, Intel's Digital Home Group

Tuning, More Tuning, and Optimization
The development team at GRIN focused their optimization efforts for the game on platforms powered by Intel® Core™ microarchitecture, tailoring the code to take maximum advantage of the available cores in a system as well as the presence of features such as Intel® Streaming SIMD Extensions 3 and 4.1. "We also think that DirectX* 10 support is important and that it requires some optimizations," Potages said. "The good thing is that DX and drivers are also starting to use the available cores in an efficient way."

"Our main objective with the game engine was to give the player a good frame rate, even in high-workload circumstances," Potages continued. "I think we all hate it when you get a huge drop in the frames per second (FPS) during intense action scenes because of the number of enemies driven by artificial intelligence. So the natural way to keep the FPS rate high is to ensure we take advantage of all the cores. This is definitely not an easy task and requires a lot of refactoring when working on an existing game engine. During development, we also wanted to fully support DX10 and its new features."
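Tailoring code to "the presence of features" usually comes down to a run-time check that picks a code path. The sketch below is only an illustration of that idea, not GRIN's code; it assumes a GCC or Clang compiler, and the two blend functions are hypothetical stand-ins for real scalar and SSE4.1 kernels.

```cpp
#include <cstdio>

// Hypothetical render kernels; stand-ins for real scalar and SSE4.1 paths.
static void blend_scalar() { std::puts("scalar blend path"); }
static void blend_sse41()  { std::puts("SSE4.1 blend path"); }

int main() {
    // __builtin_cpu_supports is a GCC/Clang builtin; other compilers would
    // query CPUID directly to make the same decision.
    if (__builtin_cpu_supports("sse4.1"))
        blend_sse41();
    else
        blend_scalar();
    return 0;
}
```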
GRIN took advantage of support from Intel to craft the latest version of Bionic Commando, working to target the hardware platforms that would be in the market at the projected time for the software release. "The support from Intel is definitely excellent," Potages said. "We received some development systems and tools to experiment and benchmark our new features, which is extremely important when you try to develop for next-generation hardware. Game development takes a long time and if you want to fully support hardware that will be out when your game will be released, you need this kind of support very early on."

"Another thing we really appreciate with Intel," Potages added, "is the constant dialogue we have with them. This includes advice and best practices, but they also analyze and benchmark our engine; it is very useful to have comments from very experienced developers that know the hardware much better than most game developers!"

Among the development tools put into play, Intel VTune Performance Analyzer took center stage, but the GRIN team also took advantage of two threading analysis tools: Intel® Thread Checker and Intel® Thread Profiler. Much of the code optimization took place on Intel-based development systems. GRIN used a wide range of configurations
to ensure solid platform support and optimized code that worked well for all of them. “When it comes to benchmarking and optimization,” Potages said, “VTune is definitely the most useful tool we have. We were able to pinpoint several design issues (for instance, we reduced the data flow and redesigned the engine in some areas that were causing far too many cache misses), but we also optimized very specific areas of the code by examining how different implementations behaved. Also, Thread Checker found several possible data races that would have been extremely hard to track otherwise, so this was a very nice time saver.” Potages commented that game developers want to minimize the time spent debugging, implementing tools, benchmarking, and optimizing, in order to focus on new features. It’s difficult to create custom tools from the ground up, so using existing applications saves both tool development time and money. The polish and refinement of Intel VTune Performance Analyzer comes from a long history in which developer feedback has helped improve each successive release. Isolating bugs and design issues early in the development process offers big advantages when scrambling to complete a complex game on schedule.
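To see the kind of problem a threading analysis tool flags, consider this toy example (not GRIN's code): two threads bump a shared counter. The plain integer loses updates because the increment is an unsynchronized read-modify-write; the atomic counter does not.

```cpp
#include <atomic>
#include <iostream>
#include <thread>

int plainCount = 0;                 // racy: a tool such as Intel Thread Checker flags this
std::atomic<int> safeCount{0};      // atomic increment, no data race

void bump() {
    for (int i = 0; i < 100000; ++i) {
        ++plainCount;               // lost updates under contention
        ++safeCount;
    }
}

int main() {
    std::thread a(bump), b(bump);
    a.join();
    b.join();
    std::cout << "plain:  " << plainCount << "\n";   // usually less than 200000
    std::cout << "atomic: " << safeCount  << "\n";   // always 200000
    return 0;
}
```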
The performance benefits of all of the tuning and optimization work were noteworthy. As Potages said, “Basically we managed to get to a point where the bottleneck on quad-core machines is the GPU; we’re now able to add features such as additional effects that are only enabled when you get such architectures. But, we didn’t forget dual-cores. The speed increase is big, and we keep optimizing it. For instance, the benchmarks we did for Intel’s presentation at GDC 2008, “Optimizing DirectX on Multi-core Architectures”, showed a 1.76x FPS scale-up between one and two cores.”
Take It to the Limits: Skulltrail
Extreme gaming offers a trial-by-fire test bed for the most advanced hardware, and with that in mind, Intel's latest irrepressible gaming machine, Skulltrail, enters the fray like a monster truck rolling over the bleacher barriers and every other obstacle on its way into the stadium. The Skulltrail platform includes the first desktop board from Intel with dual processor sockets: the Intel® Desktop Board D5400XS. Fill those sockets with a pair of Intel Core 2 Extreme QX9775 processors and you have full-tilt boogie eight-core processing, as well as support for up to four PCI Express* graphics cards (both NVIDIA SLI* Technology and ATI CrossFireX* Technology components can be used).
The platform also handles up to 8 GB of FB-DIMM 800 memory and supports Dolby* Home Theater 7.1-channel audio. To keep the on-screen action flowing smoothly, Skulltrail data throughput benefits from a 3.2 GHz clock, up to 12 MB of cache, and a 1600 MHz Front Side Bus. The gaming potential is not the only draw: dedicated gamers who rely on overclocking to push processor performance have found a friend in Blastflow, a subsidiary of the British boutique PC manufacturer Vadim Computers. The Blastflow Tidal Skulltrail SB Block brings water cooling to the platform with a unique copper and acrylic waterblock. Intel has deliberately removed overspeed protection from the platform components, as befits a machine targeting the speed demons of extreme gaming. For those bad boys who go beyond the prescribed limits, however, the Intel disclaimers are unmistakably stern.1 Systems based on the Intel Core 2 Extreme processor QX9775 and Intel Desktop Board D5400XS will be offered by several PC manufacturers, including Armari, Boxx Technologies, Digital Storm, Falcon Northwest, Maingear Computers, Puget Custom Computers, Scan Computers, Velocity Micro, Vigor Gaming Computer, Voodoo Computers, @Xi Computer, and others.

Only the most devoted gaming enthusiast is likely to plunk down the dollars for a Skulltrail system, but there are other options available for those whose budgets are on a more terrestrial scale. Intel® GMA X3000 technology advances are narrowing the gap between discrete game acceleration cards and integrated graphics chipsets. While a certain percentage of high-end games will still require a discrete card for a satisfying gaming experience, many top-tier games perform very respectably on systems equipped with the latest-generation integrated graphics architecture from Intel.
About Capcom
Capcom began in Japan in 1979 as a manufacturer and distributor of electronic game machines. In 1983 Capcom Co., Ltd. was founded and soon built a reputation for introducing cutting-edge technology and software to the video game market. An industry leader for 25 years, Capcom has built a legacy of historic franchises in home and arcade gaming that is a testament to an unparalleled commitment to excellence. Building on its origins as a game machine manufacturer, Capcom is now involved in all areas of the video game industry and has offices in California, England, Germany, France, Hong Kong, Osaka, and Tokyo. To learn more about Capcom, visit www.capcom.com. To learn more about Bionic Commando*, visit www.bioniccommando.com.
About GRIN
GRIN is located in the heart of Stockholm, Sweden, where 115 staff members take game development to the next level. GRIN also has offices in Gothenburg and Barcelona, and a quality assurance studio in Indonesia. All told, 220 hard-working and creative individuals are developing games for the major next-generation platforms: Xbox* 360, PlayStation* 3, and the PC. To learn more about GRIN Video, visit www.grin.se.
•
1 Warning: altering clock frequency or voltage may (1) reduce system stability and useful life of the system and processor, (2) cause the processor and other components to fail, (3) cause reductions in system performance, (4) cause additional damage, and (5) affect system data integrity. Intel has not tested, and does not warranty, the operation of the processor beyond its specifications.
By Lee Purcell
Home-Grown Production Pipeline Rivals Proprietary Platforms: Attila the Hun Comes to Life through Cost-Effective Tools

Pushing the performance envelope for video effects was uppermost in Gareth Edwards' mind as he tossed aside the preconceptions and pricing structures that traditionally hobble video production and created his own accelerated effects studio for the BBC epic, Attila the Hun. Over the last few years, Edwards pioneered a number of visual effects processes, and he continues to explore new techniques for using the latest platform innovations creatively. The system platform that he constructed to drive the effects engine—assisted by a friend and technical guru, Dan Goldsmith of Armari Ltd—featured a trio of liquid-cooled workstations, the latest of which features Intel® Xeon® processors and 16 GB of memory. This potent, high-octane processing platform proved equal to the task of producing 250 high-definition (HD) visual effects shots over five months—averaging nearly two shots a day. Relying heavily on
Adobe Production Premium (specifically the Adobe Premiere*, After Effects*, and Photoshop* applications), Edwards took advantage of the multi-threaded applications to drive the effects-processing pipeline to new levels of performance. Rapid feedback during production was achieved by viewing the ongoing timeline of the production from an HD (720p) QuickTime* file in Adobe Premiere Pro*.
Escalating Effects
Edwards has built much of his professional reputation around the ability to innovatively manage effects—making small-budget projects look as though they cost far more. This talent led him to create increasingly elaborate productions and eventually gain the nod from the BBC to direct the Attila the Hun drama. As a one-man, one-workstation, effects-processing wizard for this project, it was in everyone's
interests for him to find ways to streamline the workflow, both in the tools and the computing platform used. The focus on special effects has propelled his career in ways that even surprise him. After creating the computer graphics for a BBC TV show called Seven Wonders of the Industrial World, he received recognition from staff members for achieving an epic look on a low budget. This led to an opportunity to direct a show himself and his reputation continued to build.

"From then on, I plowed all energy into my effects," Edwards said. "I was given TV shows that were heavy with visual effects to direct. The great thing about doing your own visual effects, the way I am, is that it tends to make your production look like it has twice the money it has, but you can really add to the scope of the project."

"So, say you have a budget of $250,000 for a show," Edwards continued, "you can make it look like half a million. For your next show, people think you had half a million, so they will trust you for half a million, and then you make that look like a million. Then, people will trust you with a million. I found that it was very quick to grow your budget, because normally it takes years to crawl your way up the ladder and get bigger and bigger projects, but by the time I got to Attila, it was the third project I had directed. I was very lucky, because without visual effects it could have taken me a few decades to get to that position."

The Intel-based platform and Adobe Production Premium spurred some genuinely creative adaptations to expand the scope and appearance of Attila the Hun. For example, to create a complex battle scene with 30,000 warriors, he filmed four stuntmen fighting and then copied and pasted these characters, with the timing offset, into the background until he achieved the desired effect. Without the many innovations used in this project, Edwards stated, he would not have been able to complete Attila the Hun on time or within the allocated budget.

The Art of Misdirection
One technique that Edwards employs effectively is using the art of misdirection to keep the viewer's eyes focused on certain areas of a frame and away from areas where subtle flaws in the detail or backgrounds might be visible. He relies on the fact that the human brain can take in only a certain amount of information at a time.

Adobe After Effects* was used heavily throughout Attila the Hun.
"I don't know exactly what the math is," Edwards said, "but if you add up 1280 × 720 pixels (which is what American HD telly is) there are so many pixels and then 30 of those frames every second. You end up with so many pixels that the human brain just can't take that information in—it is just not how the human brain works: you can only take in so much at a time. I am a great believer that when you watch a shot, there are finite limits to the amount of information you can take from that shot. I have basically pinned my whole career on this idea."

"If I flash a magazine in front of your face for two or three seconds (which is the length of an average shot in a film), and then pulled it away, you would not be able to recite every single item on that page. You would probably just remember the headline and an item or two. It is like that with shots, as well. Your eye is drawn to a certain place and there are rules about why it is drawn there. It usually goes to the bright areas or to the areas with movement or it goes to areas based on where it was looking in the previous shot."

"The more you can work on your shot in the context of the film," Edwards continued, "that is, knowing what content is coming before and after and kind of feeling your way through it, the more efficient you can be. With my Adobe pipeline, I have the whole film laid out on a timeline in Premiere. When I work on a shot, I am constantly rendering it back to the timeline, so I can watch it relative to the sequences. For instance, you might do the world's most amazing painting of a city, but if a horse is walking past in the foreground, everyone looks at the horse. Your brain pulls your eyes toward the horse. The great thing about working in telly as opposed to feature films is that, quite honestly, most people are going to watch your show once. There are not going to be too many people who watch it frame-by-frame for the next twenty
years (unless you have got a real hit on your hands). I work on the effects so that people can watch a movie once or twice, thoroughly enjoying it, and not notice anything awry. If you don’t know what has been done to create the effect, most people can look at the same frames over and over and not notice the subtle artifacts. I notice because I know what has been done—I know all the cheats. It is all illusion.”
Boosting Workflow for Rendering and Previewing
Nucleo* Pro 2 from GridIron Software added another mechanism for utilizing the available cores in the multi-processor system. Developed as a flexible workflow tool for Adobe After Effects CS3, Nucleo Pro provides a variety of options for managing rendering and previewing work on multi-core or multi-processor workstations. The optimized performance option, in particular, works extremely well with After Effects CS3, dividing and managing tasks among the cores efficiently. The combination of Nucleo Pro 2, After Effects CS3, and the custom-built Intel Xeon processor-based workstation cut many tasks down to a fraction of the time required previously. As a rough comparison, Edwards mentions an ad hoc test he performed on a particular segment of video that, with his previous system, took about 20 hours to render. "For me," Edwards said, "if I hit Render and it takes 20 hours to render a shot, this is too long. I'm happy to render overnight, while I am asleep, because it is downtime anyway. When it starts eating back into your day, however, that is going to cost you money. I launched Nucleo Pro to see the difference and it used all the different cores within the processors. I think it cut the time down to three or four hours. It was a massive amount of difference."

The multi-threaded software design of Adobe Photoshop* CS3 provided responsive image editing during Attila the Hun post-production.

Adobe Exploits Cores to Good Advantage
The digital content creation tools in Adobe's Production Premium CS3 get a substantial performance boost from their underlying multi-threaded code. The multi-core processing power available in Edwards' custom-built system—a full eight cores, four in each of the Intel Xeon processors—can be used to individually divide complex and lengthy operations into discrete threads, which are run in parallel, greatly reducing the overall processing time. Data-intensive operations, which in the Adobe CS3 suite include high-resolution image editing, effects rendering, and audio mastering, benefit from this multi-tasking approach, resulting in better performance and greater responsiveness to operator input through the user interface.

As part of Adobe Production Premium, Premiere Pro, which Edwards used for the post-production of Attila the Hun, features a highly threaded program environment well suited to real-time video and audio editing. Multiple threads are used in a variety of ways, including concurrent frame rendering. By rendering multiple frames concurrently—up to the number of cores available on the system platform—overall rendering time can be slashed dramatically. This approach also scales extremely well, so that as additional processor cores are brought online, they can be utilized to further enhance performance. Other applications in the suite also benefit from threading. Adobe Photoshop achieves a performance increase from splitting portions of images apart for parallel processing on individual cores. Similarly, Adobe Audition* 1.5 streamlines audio monitoring, editing, and mastering operations through strong reliance on multi-threading.

Efficiency in a production pipeline comes down to very real monetary issues, and the faster the workflow can be handled, the more cost-effective it is to the producer. Gareth Edwards demonstrated the validity of using a high-performance workstation and Adobe Production Premium CS3 components to deliver exceptional video project results on an aggressive schedule that would tax any system.
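Concurrent frame rendering works because timeline frames are independent of one another, so each core can render a different frame. The sketch below shows the general shape of that idea in C++11 threads; renderFrame is a hypothetical stand-in, and this is not Adobe's implementation.

```cpp
#include <cstdio>
#include <thread>
#include <vector>

// Hypothetical stand-in for rendering one timeline frame.
static void renderFrame(int frame) { std::printf("rendered frame %d\n", frame); }

static void renderBatch(int frameCount) {
    unsigned workers = std::thread::hardware_concurrency();
    if (workers == 0) workers = 4;                       // fallback when the count is unknown

    std::vector<std::thread> pool;
    for (unsigned w = 0; w < workers; ++w) {
        pool.emplace_back([=] {
            // Each worker takes every Nth frame; frames share no state.
            for (int f = static_cast<int>(w); f < frameCount; f += static_cast<int>(workers))
                renderFrame(f);
        });
    }
    for (auto& t : pool) t.join();
}

int main() { renderBatch(16); return 0; }
```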
A furious battle takes place in front of the green screen background.
The Benefit of Xeon Inside
Collaboration between Adobe and Intel during the development of Adobe CS3 resulted in a number of strong enhancements that strengthen the responsiveness of the applications during complex processes and shorten data-intensive operations. These enhancements ready the Adobe CS3 applications to fully exploit the capabilities of the latest Intel processor architectures, including the 45nm technology and the 47 new Streaming SIMD Extensions (SSE4) instructions available with the Penryn microarchitecture. The SSE4 instructions are given a workout in a number of functions in Adobe CS3 applications, including the motion module, composite operations, cross dissolves, gamma correction, and color correction. One of the key advantages now is that the results of these features can be viewed in real time, without the need to render segments from the timeline in order to visualize the effects. The number one effect used in Adobe Premiere—the cross dissolve of two HD MPEG video streams—can be previewed in real time
because of the optimization work that was accomplished. During the collaborative work, Adobe received guidance from an Intel application engineer, who hand-tuned many of the primary SSE4 functions for detection and execution on Penryn-based and Core 2 microarchitecture platforms, optimizing the code paths. Engineering guidance was also provided to improve threading. Tuning and optimization work relied on the proven stable of Intel® Software Development tools that have won over a generation of programmers, including Intel® VTune™ Performance Analyzer, Intel® Thread Profiler, Intel® Integrated Performance Primitives, and the Intel® C++ compiler. The productivity boosts that have been achieved thanks to the cooperative engineering work between these companies will streamline digital content creation and cut hours from production times.
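The cross dissolve mentioned above is, at its core, a weighted blend of two frames, which is exactly the kind of inner loop SIMD instructions accelerate. The sketch below uses baseline SSE intrinsics to blend four floats per iteration; it illustrates the principle only and is not the hand-tuned SSE4 code described in this article.

```cpp
#include <cstddef>
#include <xmmintrin.h>   // baseline SSE intrinsics

// Cross dissolve: out = (1 - t) * a + t * b, processed four floats at a time.
void crossDissolve(const float* a, const float* b, float* out, std::size_t count, float t) {
    const __m128 vt  = _mm_set1_ps(t);
    const __m128 vit = _mm_set1_ps(1.0f - t);
    std::size_t i = 0;
    for (; i + 4 <= count; i += 4) {
        __m128 va = _mm_loadu_ps(a + i);
        __m128 vb = _mm_loadu_ps(b + i);
        _mm_storeu_ps(out + i, _mm_add_ps(_mm_mul_ps(va, vit), _mm_mul_ps(vb, vt)));
    }
    for (; i < count; ++i)                       // scalar tail for leftover samples
        out[i] = (1.0f - t) * a[i] + t * b[i];
}
```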
Taking it on the Road
Edwards has started taking time off from the grueling production schedules
to speak at events and share his knowledge of video techniques that can be accomplished with an Intel Xeon processor-based platform and Adobe CS3 tools. In late June, he spoke at the European Education Design and Technology Conference, sponsored by Intel and Adobe. The attendees included top video professionals and the leading design schools and institutions, with a focus on pure technology. “Some of the feedback that I received from this event,” Edwards said, “is that even the teachers and professionals involved in this industry have a very hard time staying up to date with the latest developments. Everything moves so fast—especially the hardware and the software.” The significant advances in homebased production capabilities surprise even some veterans in the industry, many of whom expressed wonder at the current state of workstation power. Edwards said, “I show them they don’t have to be limited anymore. I have always aimed with the work I do to make it as cinematic as I can. Jurassic Park came out around 1993 or so and at the time there was no way you could do that sort of thing on a home computer. You wouldn’t really be able to attempt that stuff for another seven years or so. But now the gap between cinema and content you can create at home has closed to within months rather
than years. As an example of that principle, I point out that Attila the Hun had 250 visual effects, which is more per hour than Lord of the Rings had. And it was just done by one person. I'm just hoping to open people's eyes to the many things that have changed since I was in film school and get them excited and involved in the latest developments."

Multi-threaded Monster: Machine Specifications
The computer workstation that drove this project bears consideration, because its performance contributed to bringing the BBC drama in on budget and on time. The key components included:
• Dual-socket motherboard fully populated with Intel® Xeon® 5355 processors, based on 65nm technology, running at 2.66 GHz in a liquid-cooled environment
• 16 GB of system RAM
• Microsoft Windows XP*, 64-bit
• 1-gigabit Ethernet networking
• 3.35 terabytes of total network data storage
• Dual x16 PCIe graphics support
• Software packages that included Adobe Creative Suite* 3 and GridIron Software Nucleo* Pro 2

This particular workstation configuration is well suited to the demands of digital content creation, including features that make it possible to work with larger, more complex projects and to maintain responsive system interactivity while running multiple applications concurrently. Individual application performance is accelerated, particularly for software designed to take advantage of symmetric multi-processing (SMP) and threading.
Establishing a New Path
The sophisticated video effects solution and production pipeline, designed around a custom-crafted Intel Xeon processor-based workstation, gave Gareth Edwards the tools he needed to successfully complete Attila the Hun, a complex and challenging digital-content creation project for the BBC using Adobe
CS3 Suite. The path that Edwards has created suggests that future productions can be accomplished in a cost-effective way on the latest generation platforms based on Intel® Core™ microarchitecture, establishing a trend that promises to greatly improve the efficiency and impact of video productions that rely strongly on effects processing.
•
About the Author: Lee Purcell
Having survived the frenetic energy of Silicon Valley in its heyday, Lee Purcell now writes on high-tech and alternative energy topics from a rural outpost in the Green Mountain State. Thanks to Mesh Communications Group (www.meshgroup.com), through which he does much of his writing, telecommuting has replaced long carbon-spewing drives. Lee blogs on alternative energy topics at lightspeedpub.blogspot.com.
Multi-Threading Goo! A Programmer's Diary
Tommy Refenes of PillowFort matches wits with threading and emerges victorious.
By Tommy Refenes
I’ll admit it. I’m a speed freak. (I tried to get help, but it turns out those clinics by the train station are for something totally different than my addiction.) Simply put, I like what I write to run fast, getting a certain amount of personal joy when something I’ve written runs at over 100 FPS. I’ve always strived to make my code as efficient as possible, but I didn’t realize how little I knew until my previous job. Back then I was an Xbox* 360 engine programmer at a studio working crazy hours to get a game up on XBLA. I was thrown into the world of memory management, cache optimizations, rendering
optimizations, and briefly touched on the land of threading. In May 2006 I left that company to start working on Goo!, a game in which you control big globs of liquid and use those globs to envelop enemy goos. Players have total control over their goo, which means a ton of collision calculations and even more physics calculations. My new job, once again, was an engine programmer, and I knew that to have real-time interactive liquid rendering on the screen at a speed that wouldn’t give me FPS withdrawal, I would need to thread the collision and physics to the extreme.
A small disclaimer: Goo! is my first attempt at multi-threaded programming. I've learned everything I know about threading from MSDN help, a few random Intel documents, and various other resources on threads. I didn't have the fancy-pants books or the money to buy said books. So in this article I may miss a few official terms and concepts and I may be totally wrong about some things, but like all programming, it's all about results.
Act 1: Power Overwhelming
Goo! began life on an Intel® Pentium® 4 processor 3.0 GHz with Hyper-Threading Technology. Goos are rendered by combining thousands of small blob particles on the screen to make up a height map, which is then passed through filters to make the goos look like liquid or mercury. That's how they are rendered now, but in the beginning they were hundreds of smiley faces. Back then, Goo! rendered and calculated around 256 blobs on the screen. Blobs' collision calculations were culled based on velocity and position, but it was all pushed through the Intel Pentium 4 processor and, needless to say, the results weren't great. With the game rendering at around 20 FPS I knew it was time to thread the collision detection. I broke down and purchased an Intel® Pentium® D 965 Extreme Edition processor. The dual 3.73 GHz cores would be more than enough to start building a new multi-threaded collision detection system. Like a five-year-old back in 1986 with a brand new Optimus Prime, I was excited and anxious to get started on the new collision engine.

So began my first attempt at threading. I adopted a philosophy inspired by Ron Popeil (of Ronco fame): "Set it and forget it." Logically, data that isn't needed immediately is a perfect candidate for a threaded operation. Collision detection fell into this thinking very nicely because most of the time collision results will be the same for several frames before
changing. Therefore, I reasoned, collision could run on a separate thread and be checked every frame to see if the routine had finished, copying over the results when finished, and using them in the current frame’s physics calculations. At the time, this seemed like a great idea. Not waiting for collision detections to finish every frame allowed the engine to move on and continue crunching physics and gameplay calculations with old data, while still maintaining a pretty high frame rate. At this point, Goo! was pushing through around 625 blobs rendering at about 60 FPS. The threading model was simple: A collision thread was created when the game loaded, sleeping until it had something to work on. The main thread copied position and size data for every
blob into a static array the thread could access and then continued on to physics calculations and rendering. The collision thread woke up, saw it had work to do, and proceeded with a very brute force method of figuring out who’s touching who. It then saved this data into an array of indices that corresponded to every blob in the game. Once finished, it posted a flag to the main thread. When the main thread saw that the collision
thread had completed its calculations, it copied the newly calculated collision data over the old data, copied new position and size data, and once again sent it to the collision thread to calculate. At this time, each collision calculation took around 20 ms, which meant that collision data was updated just about every 1.5 frames. With 625 blobs on the screen rendering at around 60 FPS, gameplay could now start development. This is where the first problems began to hit. The goos themselves didn’t really feel like goos. When 625 blobs represent one goo it looks fine, but Goo! is a game where you battle against other goos so having more than one on the screen made the goos look a little chunky. Chunky goos (unless you are talking about GooGoo
Clusters) are no good. The solution: more blobs! I expanded the threading model by adding another collision thread and splitting up the work over the two threads. The main thread now broke the collision work up into two pieces, telling collision thread 1 to calculate the first half of the data, and collision thread 2 to calculate the second half. Once the two threads finished, they sent two flags and waited for the main thread to see
that data, copy it over, and send more work. This method yielded some better results, allowing around 900 blobs to be rendered to the screen at 60 FPS. With 900 blobs on the screen goos started looking like goos, and it was time to put the collision to rest for a while and focus on rendering. But as gameplay development progressed, various little unexplained bugs started to pop up: Memory would appear as if it were being accessed after being freed, blobs wouldn't change ownership properly, and the simulation would occasionally explode. Although these were game-breaking bugs, they were so infrequent that I didn't bother with them until about seven months later, after some core gameplay was fleshed out.
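Here is a minimal sketch of the "set it and forget it" model described above: a persistent collision thread, fed by copied snapshots and polled by the main loop through flags. All names are hypothetical and C++11 threads stand in for the platform API; this is the pattern, not PillowFort's actual code.

```cpp
#include <atomic>
#include <cstddef>
#include <thread>
#include <vector>

struct Blob { float x, y, radius; };

std::vector<Blob> snapshot;          // positions copied in by the main thread
std::vector<int>  collisionPairs;    // results written by the collision thread
std::atomic<bool> workReady{false};
std::atomic<bool> resultsReady{false};
std::atomic<bool> quitting{false};

void collisionThread() {
    while (!quitting) {
        if (!workReady.exchange(false)) { std::this_thread::yield(); continue; }  // poll for work
        collisionPairs.clear();
        for (std::size_t i = 0; i < snapshot.size(); ++i)      // brute-force pair test
            for (std::size_t j = i + 1; j < snapshot.size(); ++j) {
                float dx = snapshot[i].x - snapshot[j].x;
                float dy = snapshot[i].y - snapshot[j].y;
                float r  = snapshot[i].radius + snapshot[j].radius;
                if (dx * dx + dy * dy < r * r) {
                    collisionPairs.push_back(static_cast<int>(i));
                    collisionPairs.push_back(static_cast<int>(j));
                }
            }
        resultsReady = true;                                   // "set it and forget it"
    }
}

int main() {
    snapshot.resize(625);            // stand-in blob data
    std::thread worker(collisionThread);
    workReady = true;                // hand out the first batch
    for (int frame = 0; frame < 100; ++frame) {
        if (resultsReady.exchange(false)) {
            // ... copy collisionPairs into this frame's physics, refresh snapshot ...
            workReady = true;        // send the next batch
        }
        // ... physics, gameplay, and rendering continue with the old collision data ...
    }
    quitting = true;
    worker.join();
    return 0;
}
```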
Act 2: Why Act 1 is stupid
Now, I can hear those of you familiar with threading screaming, "What the hell were you thinking?" I agree. My threading model worked nicely for this particular application, but it was horribly inefficient and prone to all sorts of problems.
For starters, the model of having two threads sleeping and waiting for work (at least in the way in which they were waiting) was horrible. The threads waited in an infinite loop for work, and if they didn’t find any, they performed a Sleep(0) and then continued waiting for work. At the time, I thought the Sleep(0) would free the processor enough so that other threads running elsewhere on the OS would be scheduled efficiently. Boy, was I wrong. The work check loop had the two processors constantly running at 90 percent to 100 percent capacity— an absolute waste of time. To understand why, picture this scenario. Let’s say you are in charge of five construction workers tasked with paving a road. You know that you will need at least two of them to lay blacktop and at least the remaining three to grade the
road, dig the ditches, and so on. Being the efficient supervisor that you are, you wouldn’t think of allowing the two workers that need to lay the blacktop to sleep in the middle of the road until the blacktop is ready to be laid. No, to work as efficiently as possible, you would put all five of them to work grading and then all five paving. That is basically what the first threading model was doing—allowing construction workers to sleep in the path of the other workers until the blacktop was ready to be laid on the section just prepared by the other workers. The processor was tied up and unable to efficiently schedule threads, which caused the thread queue to build up, which made the overall scheduling and delegation of work much slower than it should have been, which tied up processor resources, which made the game slower. Basically my über-threaded engine was using way too much of the processor in an extremely inefficient way. Because of this threading bottleneck, odd problems began to surface. At times, the controls seemed jumpy— almost as if they weren’t being updated fast enough—and physics became unpredictable because collision data was no longer being sent every frame and a half, but every six or seven frames. By constantly checking for work with barely any rest in between, threads for other operations, such as XACT sound and XInput, could not be scheduled efficiently. Having the worker threads perform a Sleep(1) further slowed down the collision threads. Now besides their already lengthy operation, they added on at least 1 ms before any actual work was done, causing the physics to explode even more. To fix these problems I could have synced the collision to the main thread, but the frame rate would have dropped to below 30 FPS, which when rendering interactive liquid looks horrible. Goo! was in a bad way and needed help.
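The difference between the Act 1 wait loop and a properly blocking wait is easiest to see side by side. In this hedged sketch (hypothetical names, C++11 primitives standing in for the Sleep(0) and event calls the diary implies), the first worker spins and keeps a core busy doing nothing; the second sleeps inside the OS until the main thread signals that work exists, leaving the processor free for sound, input, and everything else.

```cpp
#include <atomic>
#include <condition_variable>
#include <mutex>
#include <thread>

// Act 1 style: spin on a flag, yielding each pass. Even a Sleep(0)-style yield
// keeps the core near 100 percent while no work is available.
std::atomic<bool> spinWork{false};
void busyWaitWorker() {
    while (true) {
        if (!spinWork.exchange(false)) { std::this_thread::yield(); continue; }
        // ... do collision work ...
    }
}

// Better: block until work is actually posted, so other threads get scheduled.
std::mutex workMutex;
std::condition_variable workSignal;
bool pendingWork = false;
void blockingWorker() {
    while (true) {
        std::unique_lock<std::mutex> lock(workMutex);
        workSignal.wait(lock, [] { return pendingWork; });   // sleeps until notified
        pendingWork = false;
        lock.unlock();
        // ... do collision work ...
    }
}
```

In the game loop, the main thread would set pendingWork under the lock and call workSignal.notify_one() whenever a fresh snapshot is ready, instead of leaving the worker to spin.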
Act 3: How Intel® VTune™ Performance Analyzer hurt my feelings
It was painfully obvious that I needed to rewrite the collision engine. I purchased Intel® VTune™ Performance Analyzer around this time to help analyze the threads and locate the bottleneck. Everything I discovered in Act 2 was a result of the information that VTune analyzer turned up the first time I ran Goo! through it. Although my functions had very low tick counts, there was an obvious problem with the thread queue and my privileged processor time. I was hoping for an easy fix, something along the lines of "Create threads with *inaudible* and your bottleneck will be solved." No, that wasn't the case at all. VTune basically told me that "The number of threads waiting closely matches the number of processors on the machine" and that I "should consider multi-threading my application." Obviously VTune analyzer didn't know my plight and how much I had slaved over my current threading model. Here I had just paid good money for a program to tell me that I needed to thread my already heavily threaded application. But after VTune analyzer and I sat down and talked for a bit, I began to see things her way. VTune was only looking out for me and calling it how she saw it. My thread queue was long (10 or 12 threads, I think). My privileged processor time was very high and, as a result, the game was running very inefficiently. I also found that the amount of data being passed to the threads was causing several cache misses and bringing the execution of the code on the thread to a virtual crawl. Sure, my threads were doing a lot of work, but their inefficiencies were slowing down the entire system. The entire situation basically snowballed from a very poor threading model. The solution? A new, better designed threading model.

Using VTune analyzer I found and optimized some inefficient code and optimized the threads as best I could. I restructured the collision results data both sent to and calculated on the threads to fit into cache better and I optimized some of the collision detection calculations. The remaining issues would have to be addressed later. With the Independent Games Festival (IGF) Audience Award deadline fast approaching, I needed to focus on gameplay. After going through and optimizing some functions to cut down on cache misses and rewriting part of the collision algorithms to be a little more processor friendly, Goo! finally got back above 60 FPS and was ready for display at the IGF Pavilion during the Game Developers Conference (GDC) 2008.

Act 4: A New Hope
While on display at the IGF Pavilion, Goo! was running at around 60 FPS at 1920 × 1080 with 1024 blobs on the screen, dropping to around 50 FPS when goos were tightly grouped together. At GDC it had to display at 1024 × 768, and it maintained a healthy, V-Synced 60 FPS. A few Intel software engineers approached me at my booth asking about the game and what I had done to get it to run. I told them that Goo! ran off a custom-made multi-threaded collision engine, and they told me I should enter the 2008 Intel Game Demo contest, which judges how efficiently your game scales from single-core to quad-core machines. Exciting news, but my game in its current state wouldn't even run on a single-core machine and the scaling from dual to quad core would probably not be significant enough to win or even place in the finals.

Returning from GDC, I began coding a new, much more efficient threading model for Goo!. With VTune analyzer's blessing, I decided to trash everything I had done up to that point with threading and collision. After destroying the worker threads and eliminating the copying back and forth, I was basically starting back at square one, but I now had almost two years of experimentation and experience to guide me in creating a more efficient threading model.
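Earlier, Tommy mentions restructuring the collision data "to fit into cache better." One common way to do that, shown below purely as an illustration (the diary does not spell out PillowFort's actual layout), is to move from an array of structures to a structure of arrays so the hot collision loop streams only the fields it actually reads.

```cpp
#include <cstddef>
#include <vector>

// Array-of-structures: iterating this for collision drags color and owner
// data through the cache even though the overlap test never touches them.
struct BlobAoS {
    float    x, y, radius;
    unsigned color;
    int      owner;
};

// Structure-of-arrays: the collision loop reads only x, y, and radius,
// so each cache line is filled with data the loop actually uses.
struct BlobsSoA {
    std::vector<float>    x, y, radius;
    std::vector<unsigned> color;
    std::vector<int>      owner;
};

bool overlaps(const BlobsSoA& blobs, std::size_t i, std::size_t j) {
    float dx = blobs.x[i] - blobs.x[j];
    float dy = blobs.y[i] - blobs.y[j];
    float r  = blobs.radius[i] + blobs.radius[j];
    return dx * dx + dy * dy < r * r;
}
```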
Instead of having threads constantly running, I decided to create threads when I needed them and allow them to expire. So rather than allocating two constantly running threads that would tie up even a quad-core machine, I could create more threads and split the work up according to the number of processor cores, helping the game scale better from dual-core to quad-core machines. This method proved to be several times more efficient than waiting or even suspending and resuming threads.
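A hedged sketch of that create-and-expire approach: ask the runtime how many cores exist, carve the blob list into that many ranges, spawn a short-lived thread per range with its own result buffer, and join. Names are hypothetical and the split is deliberately simple; a real engine would balance the triangular workload more carefully.

```cpp
#include <algorithm>
#include <cstddef>
#include <functional>
#include <thread>
#include <vector>

struct Blob { float x, y, radius; };

// Test one range of blobs against everything after them; results go into a
// per-thread buffer so the workers never share writable data.
void collideRange(const std::vector<Blob>& blobs, std::size_t begin, std::size_t end,
                  std::vector<int>& outPairs) {
    for (std::size_t i = begin; i < end; ++i)
        for (std::size_t j = i + 1; j < blobs.size(); ++j) {
            float dx = blobs[i].x - blobs[j].x;
            float dy = blobs[i].y - blobs[j].y;
            float r  = blobs[i].radius + blobs[j].radius;
            if (dx * dx + dy * dy < r * r) {
                outPairs.push_back(static_cast<int>(i));
                outPairs.push_back(static_cast<int>(j));
            }
        }
}

void collideAll(const std::vector<Blob>& blobs, std::vector<int>& pairs) {
    unsigned cores = std::thread::hardware_concurrency();
    if (cores == 0) cores = 2;                                   // fallback when unknown
    std::size_t chunk = (blobs.size() + cores - 1) / cores;

    std::vector<std::vector<int>> results(cores);
    std::vector<std::thread> workers;
    for (unsigned t = 0; t < cores; ++t) {
        std::size_t begin = t * chunk;
        std::size_t end   = std::min(blobs.size(), begin + chunk);
        if (begin >= end) break;
        workers.emplace_back(collideRange, std::cref(blobs), begin, end, std::ref(results[t]));
    }
    for (auto& w : workers) w.join();                            // the threads expire here
    for (auto& r : results) pairs.insert(pairs.end(), r.begin(), r.end());
}
```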
Over about two months I rewrote the entire collision engine, with phenomenal results. Goo!, which used to run at 60 FPS at 1920 × 1080, was now rendering at 100 FPS, dropping to around 80 FPS when goos are clenched tightly. At my old default test resolution of 1280 × 720 it rendered at 140 FPS instead of 80 FPS. Having the threads create and expire allowed for much more efficient scheduling over the entire processor, causing the thread queue to go down and, in turn, speeding up the game tremendously.

EPILOGUE
I recently started work again on Goo! to prepare for the Intel Contest Finals, hoping Goo! will be a finalist. The new engine is totally rewritten; not a scrap of code from the old engine exists. I have not built a final, profiled release version of Goo! with the new engine yet, but with around 1600 blobs on the screen it runs at 70 FPS at 1024 × 768 in debug and 190 FPS in a standard release build. When I push it to 2048 blobs on the screen, the frame rate drops to around 35 FPS in debug (which was around the frame rate that the last iteration of the engine ran at in its final optimized release build) and stays around 100 FPS in standard release. The new threading model is similar to the last one, in which I create threads as I need them, but the amount of code running on the threads is down dramatically, which prevents tons of cache misses and lets the new threads really pump data out at lightning-fast speeds. As I progress in my career, I intend to learn much more about threading—
understanding it as completely and fully as possible. In the meantime, I hope my blunders and obviously bad choices in thread design will inspire and warn those who read this article. Now . . . where’s my GooGoo Cluster?
•
Note: The Intel® Game Demo Contest 2008, sponsored by the Intel® Software Network, is not associated in any way with this Intel® Visual Adrenaline magazine. This article in no way affected the decisions of the judges. Registration for the game demo contest ended on July 1, 2008 and the announcement of the finalists took place on August 1, 2008.
JOIN THE MULTI-THREADING REVOLUTION Empower your game with the performance benefits of multi-core by joining the Intel® Multi-Core Developer Community. Providing technical information, tools, and support from the industry experts, the Intel Multi-Core Developer Community can help you discover how to best develop parallel programs and multi-threaded applications on multi-core and multi-processor platforms. Connect with the experts by visiting http://softwarecommunity.intel.com/communities/multicore
About the Author: Tommy Refenes
Tommy Refenes was born June 14, 1981 in the technology-deprived town of Hendersonville, North Carolina. At the age of 11, Tommy's parents bought him a "laptop" which weighed about 30 pounds, and it was on this computer Tommy started coding in QBasic* and became interested in developing games. In high school, Tommy took classes in Pascal and C++, which further nurtured his interest in programming. While attending North Carolina State, Tommy took contract jobs for a few large .com companies and eventually left the university early to pursue a career in software development. After holding a few web and server development jobs, Tommy decided to sell his house and car and move to the Netherlands to start work at a company as an Xbox* 360 engine programmer. In May 2006 Tommy founded PillowFort and began work on the game, Goo!, which gained an IGF Technical Excellence nomination in 2008.
World in Conflict*
By Sylvia Flores
BRINGING BACK SUPERPOWER RIVALRY
Just when you thought the Cold War was over, veteran PC game developer Massive Entertainment brings back the superpowers for nothing short of an epic battle in their latest title, World in Conflict*: Soviet Assault, due in early 2009. Arguably the best real-time strategy (RTS) game Massive has shipped, World in Conflict moves away from sci-fi and into a "what-if" scenario of the Soviet Union invading Western Europe and the United States. Featuring historically accurate weaponry, World in Conflict* unleashes the ultimate arsenal of assault rifles, anti-tank weapons, H-bombs, nukes, armed trucks, tanks, planes, and more.
But that’s not even the most impressive part. Everything in the game is 100 percent destructible—and I mean everything. “This was something that the designers really wanted in the game and we had to get several different aspects to work together to make it all work,” said Massive’s Technical Director Niklas Westberg. “The result is a mix of physics, pre-animated sequences, a custom solution for ground deformation, and amazing work by our artists. Another cool thing is that the AI reacts to the changes in the environment, so it’s pretty dynamic and makes for an interesting experience.” Interesting to say the least. With 360° of camera control, advanced lighting and physics, unbelievably realistic graphics, and the ability to blow absolutely everything to smithereens, it’s hard to remember that it’s just a game.
NOT YOUR MAMA'S RTS GAME
World in Conflict unfolded in the studios of Massive as one of the most ambitious projects they'd ever undertaken. "We had already done two Ground Control* games, and we thought it was about time we did something a little different," said Massive founder and President
Martin Walfisz. “We have a lot of people in the studio who are into the whole military thing, so it felt natural to make a game with a realistic and plausible military setting.” Martin continued, “The game is an action-oriented strategy game where the player picks one of four different roles to take command of infantry, air, armor, or support. We’ve balanced these different roles so they all have their own natural enemies. So in multi-player, players are urged to cooperate with their teammates to win.” Okay—but the Cold War? For those of us who lived through it, the idea of World War III at the time was very real. Playing a game where it actually unfolds like an alternate universe before your eyes is nothing short of brilliant—scary—but brilliant. “The reason we settled for the Cold War setting was that we didn’t want to fall into the same pattern of many other games, with World War II or Middle East conflicts,” Martin explained. “We wanted to do something more unique and decided to jump back two decades to explore what an all-out war between the two superpowers of the time would look like.”
And all-out war we got. As a strategic chess match of ultimate destruction, World in Conflict takes RTS gaming to entirely new and relentless heights. Martin explained, “The biggest differences between World in Conflict and other RTS games are probably our focus on teamwork and the fast pacing. Seeing that we don’t have any base-building or resource management, we can have players go online and join a server in the middle of a match and have a great experience! The drop-in multi-player really sets World in Conflict apart from other strategy games. And in the single-player campaign, we have a very strong story and interesting characters that we get to know in much more detail than traditional strategy games.” Having been originally designed to be a multi-player focused game, World in Conflict evolved quickly to support both single-player and multi-player. “After considering what we could do to make a more interesting campaign, we decided to go all in and take RTS storytelling to a new level. While the two different modes might look alike on the surface, the single-player campaign takes away the collaborative demands to not cramp the player’s own style. The singleplayer missions are heavily based around story and campaign objectives.” So the question is—what on earth goes into the development of an über game like World in Conflict? Everything and then some.
MASSIVE TECHNOLOGY
World in Conflict was built using Massive's proprietary Masstech* engine. First developed in 2002 for Ground Control II, the engine has been continuously improved by Massive for the ultimate gaming experience. "We developed the engine ourselves and we know the purpose of every single line of code," said Niklas. "So when something breaks—and let's be honest, it does—we know how to fix it and we don't have to rely on anyone else's support. Also, all the features and functionalities of the engine have been tailored for the needs of our own games, so our designers and artists have all the things they need to realize their ideas. It really makes things a lot easier for everyone." Niklas continued with pride as he discussed the Masstech engine, "With every iteration of the engine, we've come closer and closer to the vision of a scalable strategy game that people with both old and new machines can play."

Realizing the future of highly threaded games, Massive chose to take their Masstech engine to the realm of multi-core. "We added multi-core support to an engine that was originally built for running on a single core," explained Niklas. "We used the extra cores for very specific tasks like physics, particle updates, tree animations, shadowing calculations, and VoIP. In the next revision of the Masstech engine, which is currently in development for our future games, the multi-core support will be rooted deep in the architecture, which will allow for a wide and more thorough use."
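Massive's description of handing "very specific tasks" such as particles, tree animation, and shadow calculations to the extra cores maps naturally onto task-level parallelism. The sketch below is only an illustration of that shape, with hypothetical stub functions and std::async standing in for whatever job system Masstech actually uses.

```cpp
#include <future>

// Hypothetical per-frame subsystem updates, reduced to empty stubs.
static void updateParticles(float)     { /* particle simulation */ }
static void updateTreeAnimation(float) { /* wind and vertex animation */ }
static void computeShadows()           { /* shadow-map preparation */ }
static void updateGameLogic(float)     { /* AI, orders, pathfinding */ }
static void render()                   { /* submit draw calls */ }

void frame(float dt) {
    // Farm independent subsystems out to spare cores for this frame...
    auto particles = std::async(std::launch::async, updateParticles, dt);
    auto trees     = std::async(std::launch::async, updateTreeAnimation, dt);
    auto shadows   = std::async(std::launch::async, computeShadows);

    updateGameLogic(dt);   // ...while the main thread keeps the simulation moving.

    particles.get();       // wait for the side tasks before rendering needs their output
    trees.get();
    shadows.get();
    render();
}

int main() {
    for (int i = 0; i < 3; ++i) frame(0.016f);
    return 0;
}
```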
The graphics in World in Conflict* are arguably some of the best you'll see anywhere outside of real life—and maybe then, even better. No matter where you stand in the game (or run for fear of being blown up), you'll find a virtual and beautiful reality—and that's up close or from a distance. Smoke, liquid, light—it's all done so superbly, I wonder what Massive could possibly do next to trump it. Broadly developing their game to work with as many graphics cards as possible was critical to making the game stunning on just about any system. "Our ambition has been to optimize the game for all kinds of graphics cards and hardware and, in the case of Intel® Graphics, we have made an extra effort in testing and optimizing the game to make sure we get top performance," explained Niklas. "The integrated graphics cards are becoming more and more popular, so they are an important sector for us."

Also critical to the graphics experience was their choice of physics engine. "We're currently using the Havok Physics* engine for World in Conflict and World in Conflict: Soviet Assault," Niklas said. "The process in choosing it was very easy. I had some prior experience with Havok from an earlier project and I knew that the quality of the engine and the support was world class."
Developing for multi-core without the right tools can be rough. “I could talk for hours about issues with multi-core development,” Niklas said. “It’s difficult to write efficient multicore code and it’s difficult to debug it. It all comes down to data organization. Managing your data well and keeping the code straightforward becomes more important than anything else.” To help get them through some of those painful areas, Massive developers used Intel® VTune™ Performance Analyzer to hunt for inefficiencies in code. “It definitely made our lives easier!”
"We wanted to have a strong online platform to keep the community together, and we wanted a system that didn't get in the way for what most players really want—to quickly log in and play the game."
— Martin Walfisz, Massive founder and President
THE MASSIVE COMMUNITY
World in Conflict appeals to single-player and multi-player audiences. But multi-player mode provides a truly rich and integrated gaming experience as you game with fellow clan or community members. Enter Massive's Massgate* multi-player server system. "Massgate is the result of years of research, design, and coding," Martin said. "We wanted to have a strong online platform to keep the community together, and we wanted a system that didn't get in the way for what most players really want—to quickly log in and play the game. We wanted to ensure that players could chat, run clans, play clan matches and leaderboard matches, or just drop in for a quick casual game. With the great feedback we've gotten from the community and the press, it really seems that we've succeeded."

The strange and wonderful thing is—this game hooks you emotionally. The cast of characters in World in Conflict are
disarming and may even bring an occasional tear to the eye. With an exceptional soundtrack, stunning visuals, and the amazing voice of Hollywood actor Alec Baldwin, World in Conflict draws you in mercilessly and keeps you coming back for more. “It’s been incredible to see the community evolve around the game and be really involved in developing it further,” said Martin. “With all the different mod tools we’ve released, it’s really impressive to see how engaged and creative gamers out there can be.” So what could possibly be next for Massive? It’s top secret of course. But I’ll tell you this—after playing World in Conflict, I know it’ll be nothing short of absolute brilliance.
•
RESOURCES
Intel® Software
Intel's heightened focus on visual computing and graphics processing is complemented by software development products, graphics chipsets, technical expertise, and developer-oriented resources. Keep up with the activities of Intel's Visual Computing Software Division through www.intel.com/software/visualadrenaline.
Dig deeper and explore any of the following resources related to the topics covered in this issue of Intel® Visual Adrenaline magazine:

Opening an Architecture to Creative Development: Softimage Goes Custom
To learn more about the capabilities of Softimage|XSI* and ICE, go to www.softimage.com/products/xsi/. Or, join the Softimage XSI Community at www.xsibase.com/index.php.

It’s Back, Badder than Ever: Bionic Commando*
To learn more about GRIN Video, visit www.grin.se. To learn more about Bionic Commando*, visit www.bioniccommando.com. For more information on the Skulltrail platform, visit softwareblogs.intel.com/2008/02/21/intel-desktop-board-d5400xs/.
Unlocking the Potential of Graphics Processing: Technology Transfer at Its Finest
To express an interest in graphics architecture research at the university level, send an e-mail to us at lrb.sdk@intel.com detailing your current research directions and background. To learn more about university research grant programs at Intel, go to techresearch.intel.com/articles/None/1440.htm.

Epic’s Unreal Engine* Stops Playing Around: Non-Game Uses Open New Opportunities
To learn more about Unreal Technology components available from Epic Games, visit www.unrealtechnology.com. For mod and developer support, go to the Unreal Developer Network at udn.epicgames.com/Main/WebHome.html.

Genetic Genius
To learn more about CG artist José María Andrés, visit www.alzhem.com/index.htm. To learn from José’s tutorials, visit www.alzhem.com/sub1/tutorials.htm.

Home-Grown Production Pipeline Rivals Proprietary Platforms: Attila the Hun Comes to Life through Cost-Effective Tools
For more information about digital content creation using Adobe Creative Suite* 3 Production Premium, visit www.adobe.com/products/creativesuite/production/. To learn how Gareth Edwards accomplished special effects in Attila the Hun, go to www.fxguide.com/article463.html.

Multi-Threading Goo!—Optimizing PillowFort’s Game
To see the finalists for each category of the Game Demo Contest 2008, announced on August 1, 2008, go to softwarecontests.intel.com/gamedemo/index.php. For more about the latest activities at PillowFort Games, visit www.pillowfortgames.com.

World in Conflict*
To learn more about the Havok Physics* engine that Massive Entertainment used for World in Conflict*, visit www.havok.com. To learn more about Intel® Graphics, visit www.intel.com/products/graphics/.
• Subscribe to the Intel® Software Insight magazine: www.intel.com/go/softwaredispatch
• Sign up for the Intel® Software Partner Program, available to software companies: www.intel.com/partner
• Tap into multi-core resources: www.intel.com/software/mcdeveloper
• Find out more about Intel® Software Network: www.intel.com/software
• Explore Intel® Software Development Products: www.intel.com/software/products
• Build your knowledge base with books from Intel® Press: www.intel.com/intelpress/
• Find online and classroom training courses from Intel® Software College: www.intel.com/software/college
• Interact with a lively community of individuals in the Intel® Graphics Developer Community: www.intel.com/software/graphics/
To sign up for an ongoing (gratis) subscription to the Intel® Visual Adrenaline magazine, as well as the Intel® Software Dispatch visual computing edition e-mail program, go to: www.intelsoftwaregraphics.com.
To subscribe to Intel® Visual Adrenaline, go to www.intelsoftwaregraphics.com
Intel does not make any representations or warranties whatsoever regarding quality, reliability, functionality, or compatibility of third-party vendors and their devices. All products, dates, and plans are based on current expectations and subject to change without notice. Intel, Intel logo, Intel Core, Pentium, VTune, and Xeon are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States and other countries. *Other names and brands may be claimed as the property of others. Copyright © 2008 Intel Corporation. All rights reserved. 07/08/SM/CS/PP/15k 320325-001US