VOLUME 4 • ISSUE 4 • APRIL 2007 • $8.95 • www.stpmag.com
Memory Leaks in Your Java Apps? Sun Tells How to Plug Them
Stress Testing for Boundaries, Limits and Tolerances
Appetize Your Apps With a GUI Bug Hunt
Grooving With the Gremlins: Tips for Embracing Complexity
BEST PRACTICES: Test Automation
TAKE THE
HANDCUFFS OFF
QUALITY ASSURANCE
Empirix gives you the freedom to test your way. Tired of being held captive by proprietary scripting? Empirix offers a suite of testing solutions that allow you to take your QA initiatives wherever you like. Visit us at booth # 25 at Software Test & Performance Spring 2007 to receive our white paper, Lowering Switching Costs for Load Testing Software, and let Empirix set you free.
www.empirix.com/freedom
The days of
‘Play with it until it breaks’ are over!
Introducing:
TestTrack TCM The ultimate tool for test case planning, execution, and tracking. How can you ship with confidence if you don’t have the tools in place to document, repeat, and quantify your testing effort? The fact is, you can’t. TestTrack TCM can help. In TestTrack TCM you have the tool you need to write and manage thousands of test cases, select sets of tests to run against builds, and process the pass/fail results using your development workflow. With TestTrack TCM driving your QA process, you’ll know what has been tested, what hasn't, and how much effort remains to ship a quality product. Deliver with the confidence only achieved with a well-planned testing effort.
• Ensure all steps are executed, and in the same order, for more consistent testing. • Know instantly which test cases have been executed, what your coverage is, and how much testing remains. • Track test case execution times to calculate how much time is required to test your applications. • Streamline the QA > Fix > Re-test cycle by pushing test failures immediately into the defect management workflow. • Cost-effectively implement an auditable quality assurance program.
Download your FREE fully functional evaluation software now at www.seapine.com/stptcmr or call 1-888-683-6456. ©2007 Seapine Software, Inc. Seapine TestTrack and the Seapine logo are trademarks of Seapine Software, Inc. All Rights Reserved.
Contents
12 • COVER STORY In Search of Boundaries: You Are Entering the No-Code Zone
In the final article in our three-part series on boundaries, you’ll learn how to use stress tests to find behavioral boundaries, limits and tolerances. By Rob Sabourin

22 • Baffled by Java App Memory Leaks?
Is brain drain driving you batty? Let Sun’s gurus show you how to stem the flow before you drown in rivers of frozen bytecode. By Gregg Sporar and A. Sundararajan

30 • Learning How To Groove With The Gremlins
Lost your keys, got a flat tire, suffered through an unexplained system crash? You can’t fight entropy. Instead, learn to embrace the inevitable: chaos and confusion. By Linda J. Burrs

35 • Bug-Hunting In Your GUI Apps
Graphical user interfaces are the most complex and interactive software tools designed, so they demand a richer testing process that will give your UI a workout. By Dan Rubel and Phil Quitslund

Departments
7 • Editorial When it rains, it pours: a cautionary tale about the dangers of letting leaks go.
8 • Contributors Get to know this month’s experts and the best practices they preach.
9 • Letters Now it’s your turn to tell us where to go.
10 • Out of the Box New products for developers and testers.
40 • Best Practices A postcard from Cairo, where the grass isn’t always greener. By Geoff Koch
42 • Future Test Trump Homeland Security, and sort your software threats by color! By I.B. Phoolen
EDITORIAL
Editor: Edward J. Correia, +1-631-421-4158 x100, ecorreia@bzmedia.com
Editorial Director: Alan Zeichick, +1-650-359-4763, alan@bzmedia.com
Copy Editor: Laurie O’Connell, loconnell@bzmedia.com
Contributing Editor: Geoff Koch, koch.geoff@gmail.com

ART & PRODUCTION
Art Director: LuAnn T. Palazzo, lpalazzo@bzmedia.com
Art/Production Assistant: Erin Broadhurst, ebroadhurst@bzmedia.com

SALES & MARKETING
Publisher: Ted Bahr, +1-631-421-4158 x101, ted@bzmedia.com
Associate Publisher: David Karp, +1-631-421-4158 x102, dkarp@bzmedia.com
List Services: Agnes Vanek, +1-631-421-4158 x111, avanek@bzmedia.com
Advertising Traffic: Phyllis Oakes, +1-631-421-4158 x115, poakes@bzmedia.com
Reprints: Lisa Abelson, +1-516-379-7097, labelson@bzmedia.com
Marketing Manager: Marilyn Daly, +1-631-421-4158 x118, mdaly@bzmedia.com
Accounting: Viena Isaray, +1-631-421-4158 x110, visaray@bzmedia.com

READER SERVICE
Director of Circulation: Agnes Vanek, +1-631-421-4158 x111, avanek@bzmedia.com
Customer Service/Subscriptions: +1-847-763-9692, stpmag@halldata.com

Cover Photo Illustration by The Design Diva, NY

President: Ted Bahr
Executive Vice President: Alan Zeichick

BZ Media LLC, 7 High Street, Suite 407, Huntington, NY 11743, +1-631-421-4158, fax +1-631-421-4130, www.bzmedia.com, info@bzmedia.com

Software Test & Performance (ISSN 1548-3460) is published monthly by BZ Media LLC, 7 High St. Suite 407, Huntington, NY, 11743. Periodicals postage paid at Huntington, NY and additional offices. Software Test & Performance is a registered trademark of BZ Media LLC. All contents copyrighted 2007 BZ Media LLC. All rights reserved. The price of a one year subscription is US $49.95, $69.95 in Canada, $99.95 elsewhere. POSTMASTER: Send changes of address to Software Test & Performance, PO Box 2169, Skokie, IL 60076. Software Test & Performance Subscribers Services may be reached at stpmag@halldata.com or by calling 1-847-763-9692.

Ed Notes
For Lack of A Gallon...
Edward J. Correia

I heard a chilling story yesterday about a water leak that caused tens of thousands of dollars in damage, left a family (including a newborn baby) homeless, and impacted the lives of that family and perhaps dozens of others for years to come. All for the lack of a few gallons of home heating oil.

After a year of house hunting, Amy and Jason had finally found their dream home. They fretted over the timing: If you’ve ever sold one house to buy another, you know what a delicate dance it can be, like stacking dominos. Everything was all set. They had a buyer lined up for their current home. Two others had previously backed out; we’re in a so-called “real estate down-turn” on Long Island at the moment. The seller of their dream home had already moved out and away. Financing was arranged.

There was just one problem: The couple selling the dream home was in the midst of a nasty divorce. It seems that the husband, bitter with spite, stopped paying the fuel bill. When the deliveries stopped coming, the tank ran dry. When the tank ran dry, the burner had nothing to burn. When the burner had no fuel, there was no more heat to keep out the record cold. The February Northeast chill came in and froze the pipes, and water ran uncontrolled in the empty house for weeks.

With the closing only two weeks away, Amy decided to take a drive by the new house. The weather was warmer now, even though it was still early March. It was a beautiful springlike day, but something was wrong. The ornate front door, made of carved hardwood and leaded glass, appeared as if it had been kicked or somehow damaged. She got out of the car and went for a closer inspection. Nothing could prepare her for what she was about to see; for what had been taking place inside her new home for untold days and nights.

One can only imagine her horror as she peeked inside the house. Ceilings and walls had fallen and collapsed, reduced to piles of mud. Hardwood floors had twisted and warped, destroyed beyond recognition. The front door, too, had succumbed. Soaked in 12 inches of water for weeks on end, the house, along with her dreams, was shattered.

A few gallons of fuel, more or less, would have prevented this disaster. Amy and Jason are faced with a terrible decision: Risk a substandard repair job by the divorced couple’s insurance company, or back out and start over. They’ve decided to risk it. Fortunately, Amy’s mother lives nearby, but they’ll have to sue to recoup their storage fees for their furniture and other belongings, and for any damage not covered by the insurance.

With houses, fixing leaks early is the cheapest way to go. The same is true for software. Even in a managed environment such as Java, leaks can (and often do) rob apps of performance over time and eventually cause them to crumble.

This month, we’re fortunate to offer a terrific tutorial by two of Sun’s top programmers on finding and repairing memory leaks in Java apps. It’s the first installment of a two-part series covering leaks in young and tenured generations. For more about the authors, see the contributors’ page. And please, fix leaks early.
Contributors

Dedicated to helping companies succeed at building teams and high-quality software solutions, ROBERT SABOURIN and his Montreal-based consultancy, Amibug.com, count among their customers academic publisher Addison-Wesley, IT training company Logisil, McGill University and smart-card maker Gemplus. Sabourin concludes his three-part series on boundary testing with our cover story, in which he describes how he has applied stress testing to expose the boundaries in customer applications that emerge when a system’s limits and tolerances are exercised. The article begins on page 12.

A pair of development experts from Sun Microsystems explain how memory leaks might occur in your Java applications, and share their specific tools and techniques for finding and eliminating them. Technical evangelist on the NetBeans project, Sun’s GREGG SPORAR has been a software developer for more than 20 years and has used Java since 1998. Eleven-year veteran developer A. SUNDARARAJAN is a member of Sun’s Java core technologies team (part of the JDK development team). The first article in this two-part series begins on page 22.

In “Learning to Groove With the Gremlins,” DR. LINDA J. BURRS takes a whimsical look at the chaos that exists all around us and offers serious advice on how to cope. Gremlins start on page 30. With multiple degrees in organizational leadership and management, Dr. Burrs has spent more than 25 years bringing her dynamic approach to coaching, training and team building to corporations and professionals. Clients of her Step Up to Success! consulting firm include law firms, technology organizations, educators, business professionals, leadership groups and nonprofits.

DAN RUBEL is chief technology officer at Instantiations, and has been designing object-oriented software since 1987. He and PHIL QUITSLUND, architect of the company’s Window Tester GUI test automation tool, provide techniques for GUI test mechanization specifically for several popular commercial and open-source tools beginning on page 35. Both men are considered experts in their respective fields. Rubel was instrumental in the design and development of several successful coding products and frameworks, and co-authored “Eclipse: Building Commercial-Quality Plug-ins” (Addison-Wesley, 2004). Quitslund has been active in the Eclipse research community since 2002 and has built numerous tools and extensions.

TO CONTACT AN AUTHOR, please send e-mail to feedback@bzmedia.com.
Feedback AGILE ARTILLERY FIRES BACK Regarding Andrew Binstock’s comment that agilists put “lots of premium on people not processes, and they like the world in which specs and regulations do not interfere with the development process. The problem is that if you don't follow specs and regs, chaos ultimately ensues. We all know the populist views of agilists. It would be good if one day they would discuss the role of rules, regs, requirements and discipline.” The common misunderstanding is that agilists ignore processes. They don't. They put the emphasis on people over processes. That does not mean ignoring processes. Just consider this question: Who defines and maintains the processes? Read James Womack and Daniel Jones’ “Lean Thinking” (Revised edition, Free Press, 2003) and Mary and Tom Poppendieck’s “Lean Software Development” (Addison-Wesley, 2003) and you should get closer to the agile/lean point of view. Gudlaugur Egilsson Reykjavík, Iceland
NO QUICK ANSWERS Regarding Edward J. Correia’s T&QA newsletter article “A Failure to Communicate” (Jan. 30, 2007): Even if communication is an issue, I think this discussion went a little bit too fast into one direction only, and everybody seemed to have a quick answer as to what went wrong. This reminds me of airplane crashes, where the specialists need two years to find out what went wrong while the media knows it already after the first hour. In my career, I have seen many reasons where situations like the one I have described may occur. Here are only a few examples: Unawareness of the impact of changes. Every time a programmer passes a module to testing, he has made a trade-off decision between scope, schedule and quality. He has made a micro-level decision that the code is good enough, and that further improvements in quality are not justified by their cost in either a longer schedule or in delivering fewer features. These micro-level decisions may be made inconsistently (Mike Cohn in Better Software magazine, Jan. 2006). The same applies to any party involved in the SDLC. Discipline. The best process is of no value if management does not feel responsible to make sure people comply with defined standards. Uncontrolled growth. If access to test systems is not restricted to a limited number of responsible people, uncontrolled growth rules the system. Pressure. A company that’s under pressure to deliver software in time is probably more addicted to taking hazards. Lack of resources. It can be quite a challenge if a test system is used for several purposes at the same time or if it is used by many different testers spread all over the world. Torsten Zelger Zurich, Switzerland
TEAMWORK AND AUTOMATION I enjoyed Murtada Elfahal’s article “Teamwork Is the Future of Testing” (Feb. 2007). We have embarked on a test automation project for applications written in Centura 1.5 that have nonsupported controls. IBM ROBOT failed to support Centura 1.5. Sameh Zeid Ellenwood, Ga.
DO IT YOURSELF Regarding Edward J. Correia’s “A Failure to Communicate,” I gave up on getting notified by developers and sys admins when changes were happening. I wrote test code that retrieved version information for every test and checked for changes. If you can, you should check the date, time and file size of executables and config files for every test. Tom Tulinsky Los Angeles, Calif.
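The check Tulinsky describes lends itself to a few lines of script. What follows is a minimal sketch in Python, not his actual test code: the function names `snapshot` and `changed_files` are hypothetical, and it assumes the executables and config files of interest can be listed by path. It records each file’s size and modification time before a run and reports anything that differs from the recorded baseline.

```python
import os
from typing import Dict, Iterable, List, Tuple

# (size in bytes, last-modified time in whole seconds) for one file
Fingerprint = Tuple[int, int]


def snapshot(paths: Iterable[str]) -> Dict[str, Fingerprint]:
    """Record size and modification time for each listed file that exists."""
    result: Dict[str, Fingerprint] = {}
    for path in paths:
        if os.path.exists(path):
            info = os.stat(path)
            result[path] = (info.st_size, int(info.st_mtime))
    return result


def changed_files(baseline: Dict[str, Fingerprint],
                  current: Dict[str, Fingerprint]) -> List[str]:
    """Return the sorted paths that were added, removed or altered."""
    all_paths = set(baseline) | set(current)
    return sorted(p for p in all_paths if baseline.get(p) != current.get(p))
```

A test harness could call `snapshot` once when the environment is known to be good, store the result, and call `changed_files` at the start of every test so that an unannounced deployment shows up in the test log rather than as a mystery failure.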
CLASSICAL AND CLINICAL I found Edward Correia’s article “The Importance of Being (and Testing in) Earnest” (T&QA newsletter, Feb. 6, 2007) a wonderful classical and clinical description of the subject matter. Over the last 30 years, this has been a repeated presentation that has served to bolster (and not allow us to forget) the pragmatic foundation of verification and validation.
While I would have hoped to see this immortalized in pervasive practice, I must resign myself to seeing it as a reminder. Thanks for repeating what others continue to evangelize about. Jerry Durant Winter Springs, Fla.
TALENT RECOGNITION Regarding Edward J. Correia’s “Team Leaders Behaving Badly” (T&QA newsletter, Feb. 20, 2007), good leaders must recognize talent, especially when employees demonstrate the ability to meet new challenges critical to the survival of key projects. Leaders must develop key relevant skills in their existing staff whenever possible, much like a good coach will keep his promising new quarterbacks on the bench until they become seasoned and develop the necessary maturity and stamina to become major contributors. Good leaders must recognize that consistent performers occasionally need a change of venue and allow them to groom newer employees to take over key tasks in their absence. It is not enough to have a successful team. Good managers must always be on the lookout for new opportunities that can be quickly handled by members of the staff. Adam Chornesky Colesville, Md.
FEEDBACK: Letters should include the writer’s name, city, state, company affiliation, e-mail address and daytime phone number. Send your thoughts to feedback@bzmedia.com. Letters become the property of BZ Media and may be edited for space and style.
Out of the Box
IBM Goes End-to-End On SOA Quality Do your business analysts and QA team have a folder full of napkins and other scraps with scrawled flowcharts waiting to become applications? Even if they’re not so primitive, chances are that when the IT department finally gets around to crafting those applications, your functional testing process includes returning to your flowcharts and manually comparing IT’s work to your requirements. Manually: by hand. Easing the job a bit would be a BPEL tool, which by now many analysts might be using for the modeling part. But such orchestration tools offer limited tie-in to the development side, and according to IBM, offer no connection to functional testing. Claiming to solve that problem, IBM in March unveiled the SOA Quality Management Portfolio, a series of new and enhanced tools and services for end-to-end SOA testing and QA, part of which it claims can turn BPEL code into use cases that perform automated functional testing. Performing this feat is Tester for SOA Quality. “The bulk of the job is automating,” asserted Dave Locke, IBM Rational’s director of offerings marketing, in a recent phone interview. “For business analysts using BPEL tools to illustrate their processes, we can read BPEL in as a testing scenario and test against it. We’re the only company in the world to do that,” he claimed. Further automation, he said, includes the ability to analyze a service’s interface and, for those lacking graphical UI characteristics, create the framework for testing them anyway. “To test a component, you have to feed it something and get something back. Having your developers build code for that testing is a time-consuming process. With this capability, we generate a shell for a GUI-less component, with all the calls and test scripts to ensure that that component really works as advertised.” Supported components include those written to Web services standards such as SOAP,
HTTP and UDDI, Locke said. “The component has to conform to an SOA approach for us to test it.” Tester for SOA Quality began shipping on March 27, along with Performance Tester Extension for SOA Quality, an extension for Rational Performance Tester that Locke said can quickly pinpoint trouble spots in an SOA. Locke described a scenario typical in many companies in which the IT services group monitors deployed applications for performance and other issues. “Someone says, ‘Man, this application is slow,’ which begins the manual process of finding the problem” by individually analyzing every piece of the architecture. “Now by clicking, people can tell developers which application is slow and link back all the way to a given line of code. This can reduce the amount of overhead in trying to find a problem.” The solution also is useful for diagnosing problems in components residing in other organizations, Locke said. “If your application is vectoring off to TRW to get a credit report [for example], developers can easily isolate the TRW component, and you can call TRW and tell them that their component is causing a 43-millisecond delay in your application.” Delivering the new response-time and application management dashboards is IBM’s Tivoli Composite Application Manager, which also has been updated for the solution.
[Screenshot caption] A new extension for Rational Performance Tester adds SOA to the tool’s capabilities, shown here testing a banking service.
The tools, combined with new SOA consulting services to be available sometime in the second quarter of this year, are part of a larger strategy by IBM to provide a testing and QA solution that encompasses all aspects of the SOA life cycle. “Most vendors are focused on testing, on errors per line of code,” said Fill Bowen, IBM’s SOA marketing manager. “We’re offering a solution from concept and modeling of a service (and even before that, with business goals) all the way through to assembly, deployment and management, once it’s in production.”
OpenMake Gives Eclipse Its Mojo OpenMake Software in March contributed its Mojo build-automation tool to the Eclipse Foundation, giving development teams a free alternative on par with IBM’s BuildForge and Urbancode’s AnthillPro, the company said. Even small development projects can involve hundreds of build, test and deployment scripts, and other files. “With a simple download, development teams can get their build processes under control,” said OpenMake CTO Steve Taylor in a news release. Mojo can generate builds for most major languages, including C/C++, Java and Visual Basic. It also works with Rational Application Developer and Visual Studio.
Lighthouse a Beacon Of Project Freedom Artifact Software has unveiled Lighthouse Pro, a free version of its project management system that includes the ability to store and track tasks, requirements and project resources, and to manage tests, time sheets, defects and issues, change requests, documents, workflows and alerts. “Despite our dependency on software, we still see alarmingly high rates of software projects failing to meet budget, schedule, scope and quality goals,” said Artifact CEO Mark Wesker in a news release. Also in the free version are rolebased security, reports and dashboards. The customizable and extensible Lighthouse Premium is available as a subscription service.
EdenTree Turns Labs Into Gardens EdenTree Technologies in March removed the fig leaves from Configuration Manager, a system setup tool that it claims can shrink the multi-hour job of setting and resetting test bed systems into an automated task of about 15 minutes. “Configuration complexity is stealing away hundreds of hours of productive test-lab time as lab engineers are stuck for days setting up tests to support expanded product lines and multiple operating systems,” said EdenTree president and CEO Jay Oyakawa. The appliance-based Configuration Manager communicates with Linux, Unix and Windows computers and storage arrays over SAN or LAN connections, and automates the setup, archiving and restoration of the machines to any desired state or version of application, database, operating system, test script or data.
McCabe Boosts Application IQ ALM tools maker McCabe Software recently unveiled a trio of applications it says afford key developers a complete suite for quality life-cycle management. Dubbed the IQ Editions, the tools target QA professionals with code complexity and coverage analysis capabilities. IQ Developers Edition “visualizes the architecture” and measures the quality of an application, according to a company news release, and compares it against more than 125 metrics. The tool “highlights the most complex areas of the code base,” helping teams determine how best to allocate development resources. IQ Test Team Edition delivers test coverage data, including McCabe’s own Cyclomatic path, MDCD (boolean), branch and lines of code. It also locates redundant code and can analyze and track code containing specific data sets. An Enterprise Edition contains all functionality in both IQ Editions and adds reporting and secure, Web-enabled test data collection.
GUIdancer: Dancing With Eclipse 3.2 GUIdancer 1.2, the latest version of Bredex’s coding-free testing tool for Swing applications, now supports Eclipse 3.2. Better late than never; Eclipse 3.2 (Callisto) has been shipping since June 2006. GUIdancer 1.2 also now includes predeveloped test cases that can be modified and reused. “There are always certain core test cases that every test needs,” said Hans Brede, the German company’s managing director. “They are, in real terms, the building blocks of a test. Because GUIdancer works on the principle of reusability, testers will find that even the initial test specification will be quicker.” The new version also includes improvements in the installation and configuration process and enhancements to its access to logs and test layouts.
Keeping a Watch For Prospective Fires In AppScan Enterprise 5, security tools maker Watchfire has added the ability to perform vulnerability checks on Webbased applications simply by pointing the tool at them. Introduced in version 5 is QuickScan, itself a Web-based app that nontechnical staff can use to analyze running Web apps, identify possible security problems and display them as a list of developer tasks. Code scanning has been enhanced, the company said, to find vulnerabilities in AJAX, advanced JavaScript and Flash. AppScan Enterprise 5 also now integrates with Watchfire’s computer-based training system, allowing users to tap into the company’s self-service tutorials, and managers to view enrollment information, course completion rates and test results.
Cut-Rate Test Manager Released If you’re looking for a utility to help manage your test scripts and other test-related documents without a huge expense, a new solution has emerged from across the pond. Softedition.net, a French company, in March unveiled Test Manager 2.04, a tool for Linux and Windows that it claims mimics HP’s TestDirector at a fraction of the cost. According to managing director Charles Similia, Test Manager helps testers create and structure test documents, build and adhere to test plans, deliver reports, help bind requirements and size, and create and organize test campaigns. The tool integrates with popular functional testing tools including SilkTest, TestPartner and WinRunner. Pricing starts at €1490.
Send product announcements to stpnews@bzmedia.com
Using Stress Tests to Find Behavioral Boundaries, Limits and Tolerances By Rob Sabourin
As a child, I broke a lot of toys. My parents warned me that if I kept doing that, I wouldn’t have much fun, but I persisted. And guess what? My parents were wrong!
My favorite destructive experiences focused on a model electric train set. I’d run the train and try to see how fast it would go. I’d rearrange tracks to see how long the train would run without jumping off. This was great fun. I changed the train configuration— box cars before flat cars followed by passenger cars, and of course the caboose near the front instead of the back. I changed the track configurations, with different long, straight stretches followed by twists and curves with the occasional figure eight. I eventually had the insight to see what would happen when I changed the track layout from a twodimensional pattern into a system with hills and valleys, slopes and gradients. I also changed the train speed: slow when it should be fast; fast when it should be slow. My experiments in train configuration led to some interesting experiences. It was fun to find out that the train would run just fine until I crossed a threshold with one of my variables. This occurred only in the beginning. Then I realized that I could have more than one train on the same track system at the same time. I could create dramatic situations by varying the parameters of both trains in the same environment to try to avoid or precipitate collisions. I now realize that my early experiments in determining the boundaries of a train system were the precursor to my later hunt for boundaries in the behavior of Rob Sabourin is president of a software consultancy, Amibug .com, in Quebec.
12
• Software Test & Performance
r Koh aron by A h p a togr Pho
www.stpmag.com •
13
NO-CODE ZONE
complex software systems. TABLE 1: FENCED IN Though my aim was scientific, Time Received Processed I used a testing approach that ended up breaking a lot of 1 100 80 trains and track—much to my 2 200 160 parents’ chagrin. 3 300 240 I find traditional domain4 400 320 testing approaches, such as 5 500 400 equivalence partitioning, useful when finding boundaries 6 600 480 related to processing, computa7 700 560 tions, data and reporting vari8 800 640 ables. Several techniques are 9 900 720 available to help us discover 10 1,000 800 those factors using both analytic and exploratory approaches. But I’ve had to apply very different testments, which explore system behavior ing approaches to identify boundaries by varying load and environmental conrelated to a system’s behavior. ditions while allowing observation of Software engineering projects have system characteristics and parameters. many sources of bugs. For each source To illustrate the concept of stress we can discover different boundaries. I testing experiments, I’ll relate a releuse techniques associated with stress vant example from a recent project I testing experiments to help me undercompleted. For a critical series of teststand the boundaries of system behaving experiments to determine limitaior. If I can identify these boundaries, I tions of system behavior, I studied elecan help my clients get a better feel for ments of the system’s behavior while the limitations of systems before varying the load in a controlled test they’re deployed. environment. I was concerned with how the system used processing capacStress Testing Experiments ity and available memory resources as To find the limitations of a software syswell as the time to complete processtem, I often use stress testing experiing of typical transactions.
FIG. 1: TEST COMPOUND
Transaction Simulator Transaction SimulatorTransaction Transaction Simulator Transaction Transaction Simulator Simulator Simulator
Generator Generator
Event Log
Data Server
Event Event Log Log
Event Event Log Log
Generator Generator Business Logic Business Logic
Web Web
Generator Generator
Console Console Generator Generator
14
• Software Test & Performance
The system being tested was a centralized examinaQueued tion server used to control interactive sessions of exams 20 completed by delegates as 40 part of a professional certifi60 cation program. Several hun80 dred concurrent users would 100 run exams at a predetermined time slot from many 120 geographically disparate loca140 tions. The exams were held 160 in many different time zones 180 with overlaps between start 200 and finish times. Exam duration was about six hours, with several well-defined sections with wellknown start and stop times. During the examination, transactions were generated from delegates’ workstations, which were PCs running Windows XP Professional with current versions of Internet Explorer. A client layer was implemented with a blend of Web-based HTTP and JavaScript, as well as some light dynamic objects. Every 15 seconds, a transaction was generated from the workstation to synchronize the exam, update questions and responses, and validate authentication. The servers ran a traditional multi-tier Web-based architecture using some operating system services to perform continuous data replication. Data was transparently replicated. The servers were running Windows 2000 using a blend of Microsoft and open-source Web software. Many different programming language environments were used. The data layer was implemented using a series of MS SQL applications and stored procedures. Load balancing was implemented using router-based approaches. My customer was concerned about whether the system could handle the amount of load over the prescribed period of time. They had suffered through some bad experiences in the past owing to server failures when roughly 20 concurrent transactions occurred. So I had to find the limits of the system in place. Could it handle the load and pattern of work expected? What were the boundaries? I set up a series of experiments to identify these boundaries. Figure 1 illustrates the type of testing environment I used to find them. 
The testing environment includes three major components: a load generator, a transaction simulator and a system console.
Load Generator
Load was generated via scripts simulating examination sessions using commercial load-testing tools. Some high-power PCs (fast CPU and plenty of RAM) were used for the testing period, each of which could simulate approximately 100 virtual EXAM sessions (each running in a separate thread).
Transaction Simulator
A workstation was reserved for the purpose of simulating typical EXAM transactions in a continuous loop as load was generated on other computers. Commercial test-automation tools were used to run scripts a number of times, keeping a detailed log of all HTTP events. Logged data included all load times and times to accomplish transactions.
FIG. 2: TIME RUNNING OUT (transaction time in ms versus load in users)
System Console
Microsoft Performance Monitor was used to monitor server performance during load generation. Sampled values included available memory and processor usage. The Web transaction IIS servers were configured to generate detailed logs of all HTTP events. The results are depicted in Figures 2, 3 and 4.
In studying the transaction time while varying load, I discovered an interesting boundary. The point of the graph at which the transaction time starts increasing is about 30 users. From one to 30 users, the time to complete a transaction is relatively constant. After 30 users, the time to process a transaction increases linearly.
The client established thresholds of acceptable system response times based on their experiences running such interactive examination sessions. The acceptable behavior of the system was set at less than 12 seconds for transactions to complete. This threshold occurred at roughly 125 concurrent users. The point on the curve at which the required threshold is passed represents a boundary at which the response is outside of the requirements. The point on the curve at which the response time changes from constant to linear is a critical boundary of system behavior.
Available Memory Boundary
In the same experiment, I studied available memory resources on the server as load was varied. The boundary of memory-usage behavior change occurred at a similar point as that associated with transaction time. As shown in Figure 3 (see page 16), the available memory changes from constant to an approximate linear drop after about 25 concurrent users.
FIG. 3: UNPLEASANT MEMORY (available memory in MB versus load in users)
Processing Capacity
I also studied the amount of available processing capacity. I observed that while the behavior starts to become nonlinear after about 25 concurrent users, the processor did not saturate until about 80 concurrent users (see Figure 4, page 16). In my experience, it’s critical to know the point at which the processor reaches saturation. If available processing resources are exhausted, transactions take longer to process and will wait longer in a queue before processing can even begin.
By performing stress testing experiments, I can find boundaries in resource usage and system behavior. I model system usage in a manner consistent with what is expected on the live system. Finding boundaries using stress testing experiments can help minimize risks of deploying underpowered solutions. Results provide a better understanding of how system resources are actually used. With these results, performance engineers can tune systems for optimal performance, and software engineers can perform what-if analysis for architecture, design and deployment alternatives.
FIG. 4: CRAMPED CONDITIONS (processor utilization in percent versus load in users)
Boundary Trends From Real Usage Scenarios
Figure 5 (see page 18) represents the processor usage of a system with a realistic simulation of several hundred users over a 24-hour period. The experiment aimed to determine processing capacity. Load was modeled with arrival times of transactions matching those realistically expected from the client workstations during live operation. Past data and log files from operational systems were used to confirm the model’s validity. Note that the processor usage reached a peak value of 50 percent, and generally fell between 30 and 40 percent.
Studying processor usage over time with a constant load helps us identify a different type of system usage boundary. In this example, we see plenty of available CPU capacity that doesn’t vary much over time. The CPU usage “trend” boundary is at about 35 percent, indicating how much of the system will be used during live operation. Some of my customers won’t release systems if the processor usage exceeds 80 percent during live operation over prolonged periods of time.
TABLE 2: MOVEMENT RESTRICTIONS
Experiment  Distribution  Arrival  Average       CPU Capacity
Number                    Pattern  Transactions  Available
001         Uniform       Random   100 tps       90 (light)
002         Uniform       Random   100 tps       60 (normal)
003         Uniform       Random   100 tps       40 (heavy)
004         Uniform       Random   200 tps       90 (light)
005         Uniform       Random   200 tps       60 (normal)
006         Uniform       Random   200 tps       40 (heavy)
007         Uniform       Random   400 tps       90 (light)
008         Uniform       Random   400 tps       60 (normal)
009         Uniform       Random   400 tps       40 (heavy)
010         Normal        Random   100 tps       90 (light)
011         Normal        Random   100 tps       60 (normal)
012         Normal        Random   100 tps       40 (heavy)
013         Normal        Random   200 tps       90 (light)
014         Normal        Random   200 tps       60 (normal)
015         Normal        Random   200 tps       40 (heavy)
016         Normal        Random   400 tps       90 (light)
017         Normal        Random   400 tps       60 (normal)
018         Normal        Random   400 tps       40 (heavy)
Related Rate Boundaries
By blending some analysis with stress testing experiments, we can better understand boundaries related to the transaction arrival and processing behavior of systems under test. Often I need to find boundaries related to processing in transactional systems used on mainframe and multitier architectures. The boundaries of concern are related to how queues fill up with pending transactions before they can be processed. I create models of how transaction events arrive and how they are subsequently processed. If a system can process transactions faster than they arrive, the queue of pending transactions will always remain small. If transactions arrive at a rate faster than the system can handle them, we have a potential problem.
Several important boundaries can be found by looking at related rate problems: When will the process queue be full? When will database cache memory be full? When will virtual memory have to start using secondary storage?
Transactions arrive with a pattern based on the type of transaction, the timing of arrival and the distribution of transactions. Transaction processing depends on the type of transactions and what else is going on in the system while a transaction is being processed. I like to use the tools of differential calculus to help me find boundaries related to the behavior of the arrival and processing processes. If I can model the arrival process as some type of mathematical function and the processing as another, I can find the circumstances in which the pending transaction queue is full.
The Bucket of Water: Arrival And Processing
A related rate problem would examine the rate at which transactions go into the queue as well as the rate at which they go out of the queue. The queue is like a bucket holding water. The arrival process is like the flow of water into the bucket. System processing is like the flow of water out of the bucket. We want to find out when the bucket fills up as we vary the input (or source) functions as well as the output (or sink) functions.
Say that a(t) models the arrival process, and b(t) models the system processing. Then we can define the depth of the queue as a relationship c(t). At any time, c(t) is related to a(t) and b(t). The rate of change of c(t) is related to the rate of change of a(t) and b(t). I’m interested in discovering some characteristics of c(t). When is c(t) at its maximum value? When is c(t) at a threshold value? When does the behavior of c(t) change?
Here’s a simple example. Let’s say I have a queue that can store a maximum of 200 pending transactions, a uniform arrival process that delivers 100 transactions per second, and a system that can process transactions at a rate of 80 transactions per second. Assuming that it starts empty, when does the queue become full? (See Table 1, page 14.) When viewed graphically as in Figure 6 (see page 18), the queue reaches the threshold level after 1,000 transactions have arrived and 800 have been processed.
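The simple queue example can be checked with a few lines of code. The sketch below is only an illustration, not the tooling used in these experiments: the QueueFillModel class and its method names are hypothetical, and it assumes constant arrival and processing rates and a queue that starts empty.

```java
/** Hypothetical helper: a pending-transaction queue fed and drained at constant rates. */
final class QueueFillModel {

    /**
     * Seconds until a queue that starts empty reaches its capacity,
     * or positive infinity if processing keeps up with arrivals.
     */
    static double secondsUntilFull(int capacity, double arrivalsPerSecond, double processedPerSecond) {
        // Net growth of the queue per second: arrivals minus completions.
        double growthPerSecond = arrivalsPerSecond - processedPerSecond;
        if (growthPerSecond <= 0) {
            return Double.POSITIVE_INFINITY; // the bucket drains at least as fast as it fills
        }
        return capacity / growthPerSecond;
    }

    /** Queue depth after the given number of seconds, limited to the range 0..capacity. */
    static double depthAt(double seconds, int capacity, double arrivalsPerSecond, double processedPerSecond) {
        double depth = (arrivalsPerSecond - processedPerSecond) * seconds;
        return Math.max(0, Math.min(capacity, depth));
    }

    public static void main(String[] args) {
        int capacity = 200;       // maximum pending transactions
        double arrivals = 100;    // transactions per second arriving
        double processed = 80;    // transactions per second completed

        // 200 / (100 - 80) = 10, so this prints "Queue full after 10.0 seconds".
        System.out.println("Queue full after "
                + secondsUntilFull(capacity, arrivals, processed) + " seconds");

        // Print received, processed and queued counts every two seconds.
        for (int t = 0; t <= 10; t += 2) {
            System.out.println(t + " s: received=" + (long) (arrivals * t)
                    + " processed=" + (long) (processed * t)
                    + " queued=" + (long) depthAt(t, capacity, arrivals, processed));
        }
    }
}
```

At 10 seconds the model reports 1,000 received, 800 processed and 200 queued, matching the figures quoted in the example. Non-constant rates would need a numerical integration in place of the single division.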
The rate of increase of the queue is the difference between the rate of arrival and the rate of processing: in this case, c(t) = a(t) - b(t), so the queue grows by 20 transactions per second. In real systems, arrival rates aren’t simple linear equations, and processing rates depend on many factors. To determine when the queue will fill, you must study the relationships between different arrival and processing rates.
To find these boundaries, I set up stress testing experiments controlling arrival and processing rates. For each experiment, I use one transaction-generating model and one processing model. For example, I could have three possible processing models depending on what other activities are going on in the system: a typical model, a harsh model and a light model. I could control access to system resources using other programs that consume system resources while testing takes place. I call these applications “resource hogs.” I consume different processing capacity for each experiment.
For the arrival process, I control the distribution pattern and load. I conduct a series of experiments studying transaction timing while load is applied with the target pattern. An example summary of these experiments is shown in Table 2 (see page 15). By varying the distribution, transaction rates and available processing capacity, I can identify whether the application processing can keep up with the arrival of new transactions. I usually study transaction processing time, but I can also look at other aspects of the system, especially if I have access to performance monitor information or database log files.
Figure 7 (see page 20) plots results from a series of experiments in which each curve represents a different configuration and the variation in load changes the number of transactions generated per unit of time. Note that the behavior of each curve is slightly different, but there are two clear trends. In A, B and C, the behavior is consistent, and in E and F, the behavior is consistent. This tells us that there is clearly a behavior boundary observed in experiments A, B and C at a load of about 50 transactions per second. At this point, the slope of the transaction-processing time curve and system behavior changes. It may be related to queues or buffers filling, or cases in which cache memory starts swapping to secondary storage.
The experiments E and F didn’t help us identify or expose any boundaries of system behavior. If the experiments continued at a load-transaction rate of greater than 200 per second, I would probably have identified a point at which the curve slope changed. It’s critical to note that the experiment explored behavior ranges of importance to the project. During the range of interest there was no clear change in slope for E and F. Stress testing experiments don’t always identify boundaries in behavior.
FIG. 5: ONE DAY AT A TIME (processor use in percent versus hours)
FIG. 6: A QUEUE QUAGMIRE (transactions received, processed and queued versus seconds)
Surprisingly Straightforward: Boundaries in Quality Factors
I’ve found boundaries related to almost every aspect of software testing I’ve ever confronted. When the characteristics of a system are hard to quantify, boundaries are often surprisingly straightforward. Methods used to quantify quality factors are often exposed by boundaries.
Usability. For example, how can we quantify elements of usability? Usability is the quality factor related to the ability of users to achieve their goals using the software being tested. It’s challenging to write requirements about usability, and even tougher to confirm that your software meets usability requirements.
In his book “Competitive Engineering” (Butterworth-Heinemann, 2005), Tom Gilb teaches a powerful approach to quantifying quality factors. The technique, which uses a tool called Planguage, allows quality factors to be described using attributes such as scale, meter and goal. Scale is the unit of measure, meter is the method of measurement, and goal is the required target value. By looking at quality factors from the point of view of scale, meter and goal, we can quantify the objective and establish means of testing to validate their existence or absence in the software we’re testing.
Let’s investigate how usability can be quantified with Planguage. Suppose we want to make sure the user interface is intuitive for our target market. Imagine a meter in which we set up a controlled environment where target users attempt to complete a transaction with the software under test. The user is given access to the system, including online help and normal documentation. While the user attempts to perform the task, we can measure the number of times that documentation or online help is referenced. We can set a goal of an average of one documentation access for 10 transactions for users of the target demographic, with suitable subject-matter expertise and experience.
With a series of controlled experiments, we can determine the trend in the number of documentation accesses per transaction to see if it is below the target. Our target is on the boundary. If the average is below the target, our software is usable. If the average is above the target, our software is not usable. Quantifying the quality factor of usability has enabled us to study boundaries in usability.
Maintainability. Another challenging quality factor is maintainability. Software systems require maintenance. Developers need to quickly adapt software, add features, change system behavior or fix bugs in an efficient manner. Planguage allows us to quantify maintainability. For example, we could measure the time required to fix bugs during system testing. The meter would be the average time to fix bugs found in system testing. The scale would be the hours from the time the issue was triaged to the time of redelivery to the testing team. The goal could be set to an average of four hours. Now during system test, we could monitor the bug-fix time as a measure of our maintainability. If it takes too long, maintainability is below the target threshold.
Goals in quality factors become the upper boundary of our testing targets. We also can examine the problem from the opposite point of view. Sometimes the goal is not known, and we must reverse-engineer it. For example, if we don’t know what the acceptable target value is, we can set up a series of experiments to determine what the actual results are and then assess whether they are acceptable or not. Once we’ve quantified them, we can start setting measurable goals for improvement.
This is akin to the old gasoline commercials that demonstrated that a car could go farther on a tank of gasoline by using a better-quality fuel. The key then and now: A meter existed, a scale was defined, and experiments determined the current value. The goal can then be established to improve the measured value.
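The usability example above can be expressed as a small check of a measured value against a goal. This is a rough sketch under stated assumptions: Planguage is a specification language, not a Java API, so the QualityGoal class, its fields and the sample measurements are all hypothetical, and treating a value exactly on the boundary as meeting the goal is a choice made here for illustration.

```java
/** Hypothetical holder for a Planguage-style requirement: scale, meter and goal. */
final class QualityGoal {
    final String name;
    final String scale;  // unit of measure
    final String meter;  // method of measurement
    final double goal;   // required target value, used as an upper boundary

    QualityGoal(String name, String scale, String meter, double goal) {
        this.name = name;
        this.scale = scale;
        this.meter = meter;
        this.goal = goal;
    }

    /** Average number of documentation accesses per attempted transaction. */
    static double accessesPerTransaction(int documentationAccesses, int transactions) {
        return (double) documentationAccesses / transactions;
    }

    /** True when the measured average is at or below the goal (assumption: the boundary itself passes). */
    boolean isMet(double measuredAverage) {
        return measuredAverage <= goal;
    }

    public static void main(String[] args) {
        // Goal from the text: one documentation access per 10 transactions.
        QualityGoal usability = new QualityGoal(
                "Usability",
                "documentation accesses per transaction",
                "controlled sessions with target users",
                0.1);

        // Invented sample measurement: 3 accesses observed over 40 transactions.
        double measured = accessesPerTransaction(3, 40);
        System.out.println(usability.name + " goal met: " + usability.isMet(measured));
    }
}
```

The same shape fits the maintainability example by swapping in hours per bug fix as the scale and four hours as the goal.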
BOUNDARIES INSPIRED BY DISASTER
RESONANT FREQUENCY BOUNDARIES AND ‘GALLOPING GERTIE’
Sometimes, perhaps too often, it takes a tragedy to open our eyes to a dangerous intersection or stretch of road, a deadly escalator defect, or a bridge that will not withstand the forces of nature. One such bridge was Tacoma Narrows, which famously collapsed in 1940. Throughout its brief lifespan, locals came to call it “Galloping Gertie” because of its swaying motion whenever the wind blew. On the morning of Nov. 7, 1940, just four months after it was completed, the winds proved too much for Galloping Gertie, and it swayed and twisted until it fell apart. What made Gertie famous was that the entire incident was captured on film by a local cameraman (http://www.youtube.com/watch?v=P0Fi1VcbpAI); the chilling images stand testament to the need for simulation testing.
Structural engineers studied the design to determine the cause. Finally, conclusive evidence was presented to demonstrate that the pattern of oscillation caused by the wind conditions hit the critical point that led to the failure. The critical boundary was not the strength of the wind, but its pattern, and the structure’s resulting oscillation. The bridge was found to have a critical resonant frequency, and when the load pattern matched that frequency, the bridge structure became unstable.
In software testing, I use the notion of frequency domain boundaries to help me find system limits based on the pattern and frequency of transactions, not just the number of concurrent patterns. Stress testing experiments can be used to determine if system behavior is stable, given different distribution patterns. The load on a real system can be considered a combination of many different component loads, each operating at a different frequency. Varying the frequency patterns of the load being generated can help you understand the frequency response of the system. I can often identify boundaries in the frequency domain, and determine if there’s a critical frequency at which the system behavior will change and become nonlinear.
SCALABILITY BOUNDARIES AND THE QUEBEC CITY BRIDGE
In 1907, the Quebec City Bridge collapsed during construction, taking the lives of 75 workers. In 1916, during reconstruction of a second bridge on the site, 11 workers were killed. In 1919, a third and successful attempt at the bridge was opened. With a center span of 549 meters (1,800 feet), it remains the world’s longest cantilevered bridge and is considered a major feat of engineering.
The problem with the first two attempts to build the Quebec City Bridge was a matter of scalability. How wide a span could be crossed with a cantilevered bridge? The bridge failed in 1907 and 1916 because the design was being extended beyond what was physically possible. Metal properties and construction techniques failed on wider bridges. If there’s a message to software engineers from this, it’s to not assume that a project can handle increased capacity merely by adding more hardware.
To this day, members of the Canadian Council of Professional Engineers wear an iron ring that symbolically shares the materials of the Quebec City Bridge as a constant reminder of the social responsibilities inherent in civil engineering. Perhaps software testers should have the same sort of steady reminder of their responsibility to those influenced by their software.
The Wal-Mart Approach
Sometimes I’m asked to define the minimum system configurations recommended for desktop software. As a testing professional, I’d love the marketing department to give me a specific list of characteristics I could use to run a simple test. I’d set up a machine with the minimal configuration and test the application to ensure that all basic functionality worked and that transaction processing time was within reasonable tolerances.
Unfortunately, my marketing departments have almost always reversed the question on me. They want to know the minimum hardware requirement of the software under test. In essence, they’re asking me to find the lower boundary of acceptable systems to run the application on.
To find this boundary, I use a simple but effective technique that I call the Wal-Mart approach. I take configurations of typical desktop systems sold off the shelf at Wal-Mart for the general market over the past several years. I run a test on several sample systems to see if the application can be installed and run acceptably. If an older system fails, I try a newer one until I get to the year in which the behavior of the application is acceptable. The Wal-Mart PC configuration of the year in question becomes the resulting lower boundary threshold.
I could have experimented with processors, memory configurations and storage capacities to hunt down the optimal and minimal configurations, but the Wal-Mart strategy is cheaper and quicker. I also identify real systems that consumers actually have in their homes.
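The search described above amounts to a scan from the oldest system to the newest, stopping at the first one that passes. A minimal sketch follows, assuming a hypothetical DesktopConfig type, invented sample configurations, and a caller-supplied check that stands in for the manual install-and-run test.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;
import java.util.function.Predicate;

/** Hypothetical helper for finding the oldest off-the-shelf system that runs the application acceptably. */
final class MinimumConfigFinder {

    /** Hypothetical description of a desktop PC sold in a given year. */
    static final class DesktopConfig {
        final int year;
        final int ramMb;
        final int cpuMhz;

        DesktopConfig(int year, int ramMb, int cpuMhz) {
            this.year = year;
            this.ramMb = ramMb;
            this.cpuMhz = cpuMhz;
        }
    }

    /**
     * Scans the configurations, which must be ordered oldest first, and returns
     * the first one that passes the acceptance check, or empty if none does.
     */
    static Optional<DesktopConfig> oldestAcceptable(List<DesktopConfig> oldestFirst,
                                                    Predicate<DesktopConfig> runsAcceptably) {
        for (DesktopConfig config : oldestFirst) {
            if (runsAcceptably.test(config)) {
                return Optional.of(config);
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        // Invented sample systems, one per model year.
        List<DesktopConfig> systems = Arrays.asList(
                new DesktopConfig(2003, 256, 1800),
                new DesktopConfig(2004, 512, 2400),
                new DesktopConfig(2005, 1024, 2800));

        // Stand-in for the real test: pretend the application needs at least 512 MB of RAM.
        Optional<DesktopConfig> boundary = oldestAcceptable(systems, c -> c.ramMb >= 512);

        System.out.println(boundary.isPresent()
                ? "Lower boundary: the " + boundary.get().year + " configuration"
                : "No tested system was acceptable");
    }
}
```

In practice the acceptance check is a person installing and exercising the application on each machine; the code only captures the order in which systems are tried.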
FIG. 7: BEHAVIOR BOUNDARIES (time in ms versus load in transactions per second, for experiments A, B, C, E and F)
Boundaries of Credibility And Absurdity
There’s one last quality factor boundary I want to share: credibility and absurdity. When we test software, we usually have insufficient time to validate many aspects. While working with these tight deadlines and limited resources, it’s important to assess how reasonable and realistic our testing is. If we focus on absurd interactions between systems and components, we’ll probably expose many interesting problems, but we won’t be sure if they’ll show up in the real world. At the same time, we employ information about how customers will use the software to establish operational profiles and define usage scenarios to help us build meaningful test cases.
I often work with Jason DeSimone, a contract tester who’s particularly gifted in bug-confirmation testing and respected by testing and development teams. Having tested software since age 15, he’s often called upon to confirm that bugs have really been fixed, often bugs that he didn’t find or identify in the first place.
I tried to understand why Jason was so highly prized as a bug confirmation tester. I always suspected that he excelled because he was technically savvy and could talk with developers about all sorts of nitty-gritty technical issues. I was surprised to find out that instead, the source of Jason’s success had to do with his notions of boundaries. Here’s Jason’s approach to confirming fixed bugs:
1. Make sure the bug can’t be repeated as described in the bug report.
2. Identify a normal usage scenario in which a user would have encountered the bug.
3. Confirm that on the baseline, bugged system, the scenario would really expose the bug.
4. Run the scenario on the fixed software to make sure the bug isn’t present and to validate it hasn’t just shifted.
5. Run the scenario repeatedly with variations, changing steps, sequencing, orders and values.
Jason would stop creating scenario mutations only when he felt that the scenario was so absurd that he would never be able to convince a developer to fix a bug in it. He was identifying a subjective boundary, but one that he and developers could easily agree upon. He could chat with the developers to confirm that he had crossed the line. If the bug didn’t reappear, it was confirmed. If it did, he could easily convince developers that it was important enough to fix.
To Boldly Go…
In the conclusion of this three-part series, we’ve learned that all phases and workflows of software engineering are constrained by some sort of boundary or limitation, and that boundary testing is a critical part of the software quality-assurance experience. With boundary testing, we “boldly go where no one has gone before,” or perhaps more accurately, where no application should go. As we learn the capabilities, extremes and limits of the software we’re testing, boundaries are truly the final frontier!
Plug the Leaks In Your Apps Before
Baffled By They Turn Into
Rivers Of Frozen Bytecode. Sun’s
Java Gurus Show You How.
22
• Software Test & Performance
APRIL 2007
By Gregg Sporar and A. Sundararajan
D
oes your Java application become slower as it runs? The cause could be a memory leak. In Java applications, memory leaks
usually end up causing performance problems. At the least, they decrease the CPU time available for running your application, slowing its ability to respond. In a worst-case scenario, your application stops responding altogether. Solving memory leak problems in Java applications requires a variety of tools and techniques. There is no sin-
gle solution—different techniques are appropriate for different situations.
What Is a Memory Leak?
an OutOfMemoryError will stop responding to requests. A common approach for resolving an OutOfMemoryError is to restart the application and use a JVM option to specify a larger heap. This is reasonable during development, when you’re determining the heap requirements of the application. The heap should grow as your application processes requests. As the load declines, the heap usage should
Brain Drain In Your The Java programming language doesn’t require the developer to directly manage memory allocations—it doesn’t even allow it. Instead, programs use the new operator to allocate objects on
Photograph by Yanik Chauvin
Java Apps?
APRIL 2007
a heap that the Java Virtual Machine (JVM) manages at runtime, and objects are never explicitly deallocated or removed from that heap. When an application no longer references a heap object, the JVM removes it using a process called garbage collection (for more information, see the article “Reference Objects and Garbage Collection” at http://java.sun.com /developer/technicalArticles/ALT /RefObj/). There are a variety of garbage collection algorithms, and all tend to use more CPU time as the heap becomes full. Therefore, memory usage by your Java application has a direct impact on performance because time spent by the JVM doing garbage collection is not available for running your code. If your program tries to hold onto too many objects at one time, it can fill up the heap. When the heap becomes completely full, subsequent attempts to allocate objects will result in an OutOfMemoryError thrown by the JVM. Few Java programs are designed to recover gracefully from this error. Usually, a Java application that causes
decline with it, as the garbage collector removes objects that are no longer in use. If the load on your application decreases but its heap usage does not, you have a problem. A frequent cause is code that allocates objects, uses them, and then inadvertently holds one or more references to them, preventing the JVM’s garbage collector from removing objects that are no longer in use. Each inadvertent reference is called a memory leak. Most modern JVMs have commandline parameters for specifying heap size (see Table 1). Unfortunately, increasing the heap size is unlikely to solve this problem. Given a larger heap, applications that have memory leaks will fill the extra memory, which serves only to delay the OutOfMemory Error. An example memory leak is shown in Listing 1. The processRequest() method creates an object that does some processing. It caches this object Gregg Sporar and A. Sundararajan are veteran software developers working for Sun Microsystems. www.stpmag.com •
23
JAVA STOP-LEAK
the garbage collection, respectively. The number in parentheses is the total amount of free heap space, and the time value is the seconds needed to perform the garbage collection. The word Full indicates that a full garbage collection was done, meaning that in addition to the young generation, unreferenced objects also were removed from the tenured generation. To see information about the permanent generation, add the command-line flag:
LISTING 1: FIND THE LEAK private static Map<Integer, WorkDetails> myMap = new HashMap<Integer, WorkDetails>(); public void processRequest(UserRequest request) { // do some processing... WorkDetails workObject = new WorkDetails(request); myMap.put(request.getID(), workObject); workObject.doTheWork(); // oops, forgot to remove the work object from the Map, // a memory leak will occur.... return; }
-XX:+PrintGCDetails.
by storing a reference to it in a Map, and then does its processing and returns. A leak is created because the processRequest() method doesn’t remove the reference from the Map; nor does any other method. The object it created is referred to by the Map and therefore can’t be removed by garbage collection This example shows a common anti-pattern: Objects are added to Collections or Maps and are never removed. Object references from static variables are another frequent source of memory leaks. If the objects are large enough, the result in either case can be an OutOfMemoryError.
How Is the Heap Allocated? The JVM divides its heap into three parts, or generations. The young and tenured generations are used to hold the objects directly allocated in your application with the new operator. The permanent generation is used to hold class and method objects, and certain strings. Since the permanent generation isn’t used to hold objects that are directly allocated by an application’s code, the generic term heap usually refers to the combination of the young and tenured generations. Unless otherwise noted, the remainder of this article will use the term heap to refer to the combination of the young and tenured generations. References to a JVM or Java Development Kit (JDK) refer to Sun’s reference implementations, version 1.4.2 or higher, unless otherwise noted. For more information about the JVM heap generations and garbage collection options, refer to the article “Tuning Garbage Collection With the 1.4.2. Java Virtual Machine” at http://java.sun.com /docs/hotspot/gc1.4.2.
24
• Software Test & Performance
TABLE 1: CONTROL THE HEAP
Option | Usage
-Xmx | Maximum size for the heap (young + tenured generations). Example: -Xmx128m sets the maximum size to 128 megabytes.
-Xms | Initial size for the heap (young + tenured generations). Example: -Xms64m sets the initial size to 64 megabytes.
-XX:MaxPermSize | Maximum size for the permanent generation. Example: -XX:MaxPermSize=96m sets the maximum size to 96 megabytes.
-XX:PermSize | Initial size for the permanent generation. Example: -XX:PermSize=32m sets the initial size to 32 megabytes.

Watching the Heap
If your application’s performance degrades over time or if an OutOfMemoryError is thrown, the first step is to monitor the heap and the permanent generation as your application runs. With JDK 1.4.2, the OutOfMemoryError doesn’t specify whether it was the heap or the permanent generation that was exhausted. Newer versions of the JDK do include specific error messages. Regardless of JDK version, several monitoring techniques are available. The simplest is using a JVM command-line option:
-verbose:gc
This causes the JVM to display information about the heap after each garbage collection. For example:
[GC 30896K->30848K(36972K), 0.0002380 secs]
[Full GC 30848K->30847K(36972K), 0.0641433 secs]
The values on the left and right of the arrow are the combined size of all objects on the heap before and after garbage collection.
The jvmstat project provides additional tools. Changes to the heap and the permanent generation of the JVM can be seen by using the visualgc tool, which offers a graphical view of the information provided by -verbose:gc, along with additional information. Further information about jvmstat is available on its home page (http://java.sun.com/performance/jvmstat/).
JConsole is another helpful utility. The JConsole application was originally developed as an experimental part of the Java Management Extensions (JMX) project, but as of JDK 5 is included in the JDK. To attach JConsole to the JVM running your application, you must specify the following flag on the command line when you start your application (the command-line flag is not required with JDK 6):
-Dcom.sun.management.jmxremote
An example of the JConsole heap display is shown in Figure 1. For more information on JConsole, refer to the online help (http://java.sun.com/javase/6/docs/technotes/tools/index.html#jconsole).
Finally, the jmap utility has a -histo option that displays a histogram of all objects on the heap. The jmap utility was originally available only on Linux and Solaris, but as of JDK 6 it’s also available on Microsoft Windows. Find more information about jmap at http://java.sun.com/javase/6/docs/technotes/tools/share/jmap.html.
These tools impose almost no runtime overhead on the JVM. They’re easy to install, configure and use, and provide a high-level view of memory usage as your application runs. Unfortunately, they’re limited to displaying high-level information, and it’s sometimes difficult to correlate the output from these tools with actions performed by your application. For more information, refer to the article “Troubleshooting Java SE” at http://java.sun.com/javase/6/webnotes/trouble. To research deeper into a memory problem beyond the information these tools provide, additional techniques and tools are needed.
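When -verbose:gc output is collected over a long run, it can help to pull the numbers out of each line and watch whether the “after” value keeps creeping upward. The sketch below parses only the line format of the two sample lines quoted in this section; the class name GcLogLine is hypothetical, the number in parentheses is read here as the total available heap (an assumption based on how Sun’s GC tuning documentation describes it), and other JVM versions or logging options may print a different format that this pattern will not match.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical parser for lines such as "[GC 30896K->30848K(36972K), 0.0002380 secs]".
class GcLogLine {
    private static final Pattern LINE = Pattern.compile(
            "\\[(Full GC|GC) (\\d+)K->(\\d+)K\\((\\d+)K\\), ([0-9.]+) secs\\]");

    final boolean full;     // true for a "Full GC" line
    final long beforeKb;    // heap occupancy before the collection
    final long afterKb;     // heap occupancy after the collection
    final long totalKb;     // value in parentheses (assumed: total available heap)
    final double seconds;   // reported duration of the collection

    private GcLogLine(boolean full, long beforeKb, long afterKb, long totalKb, double seconds) {
        this.full = full;
        this.beforeKb = beforeKb;
        this.afterKb = afterKb;
        this.totalKb = totalKb;
        this.seconds = seconds;
    }

    // Returns null when the line does not match the expected format.
    static GcLogLine parse(String line) {
        Matcher m = LINE.matcher(line.trim());
        if (!m.matches()) {
            return null;
        }
        return new GcLogLine("Full GC".equals(m.group(1)),
                Long.parseLong(m.group(2)),
                Long.parseLong(m.group(3)),
                Long.parseLong(m.group(4)),
                Double.parseDouble(m.group(5)));
    }

    long freedKb() {
        return beforeKb - afterKb;
    }

    public static void main(String[] args) {
        GcLogLine minor = parse("[GC 30896K->30848K(36972K), 0.0002380 secs]");
        GcLogLine major = parse("[Full GC 30848K->30847K(36972K), 0.0641433 secs]");
        System.out.println("minor collection freed " + minor.freedKb() + "K");
        System.out.println("full collection freed " + major.freedKb() + "K of " + major.totalKb + "K");
    }
}
```

In the two sample lines, the collections free only 48K and 1K respectively, which is the kind of pattern worth investigating: the heap is nearly full and very little of it is reclaimable.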
FIG. 1: JCONSOLE SHOWS MEMORY LEAK

Tracking Down Heap Memory Leaks
Coding problems that lead to heap memory leaks are frequently subtle. Sometimes the application must run for many hours before the leak becomes noticeable. A specific load or data set may be necessary to trigger the errant code. Even the Java language specification itself can introduce complications. For example, instances of classes that override finalize() can’t be garbage-collected until their finalize() method is run, and there’s no guarantee it ever will. For more information, see “How to Handle Java Finalization’s Memory-Retention Issues” at http://www.devx.com/Java/Article/30192/1954?pf=true.
There are two approaches for tracking down a heap memory leak. You can get a snapshot of the heap and inspect its contents, or you can use instrumentation to watch memory allocation trends as your application runs. The choice depends on a variety of factors, which are discussed below.

Using Heap Inspection
When inspecting a heap snapshot, consider two questions: which objects that should have been removed by the JVM’s garbage collector are taking up space, and why are those objects still on the heap? In other words, which objects are holding references that prevent the unneeded objects from being garbage-collected? To find the answers, you need a tool that can show you the contents of the heap and the relationships between the objects on the heap. The jhat utility included with JDK 6 provides that capability.
FIG. 2: JHAT DOCUMENTS THE HEAP
To use jhat, you must obtain a snapshot file (or dump) of the memory used by your application. If the JVM is reporting an OutOfMemoryError, add:
-XX:+HeapDumpOnOutOfMemoryError
to the command line that starts your application. This option is supported by the most recent updates of JDK 1.4.2, JDK 5 and JDK 6, and creates a file that ends with .hprof. For more information on jhat, refer to http://java.sun.com/javase/6/docs/technotes/tools/share/jhat.html.
If you don’t want to wait for an OutOfMemoryError to occur, you must force the creation of the memory snapshot file. With JDK 6, simply use the new -dump option of the jmap utility to do this. With earlier versions of the JDK, specify:
-Xrunhprof:heap=dump,format=b
on the command line that starts your application. A Ctrl-\ (or Ctrl-Break in Windows) at the console of your application will end the application and create the snapshot file. If no console is available, use the kill command (on Solaris or Linux systems) or a tool such as StackTrace (http://www.adaptj.com/root/main/stacktrace).
The jhat utility has only one required parameter: the name of the snapshot file. Note that the jhat included with JDK 6 can read snapshot files created by older versions of the JVM. The jhat utility contains a Web server, so you use a browser to access its user interface. It reads a snapshot file and then, similar to a database server, allows queries on the data in the file. It runs on port 7000 by default, so after it starts, specify http://localhost:7000 in your browser.
The code in Listing 1 was used in a small sample application that ran out of heap space, and a snapshot file was created. The initial view provided by jhat is a list of the application’s classes followed by a list of queries. The Show heap histogram query is a good starting place because it has an entry for each class. Each entry contains the class name, the number of instances and the total heap space used by those instances. For this example application, the entry for int[] arrays (called class [I) is at the top. The histogram shows that the int[] arrays are using just over 56 megabytes (MB), by far the largest amount of heap space used by any class’s objects.
If you click an entry for a class, jhat displays information about that class. The most important feature on the class’s page is the ability to click a link and see information about each instance of the class. The list of instances for int[] arrays is:
[I@0x22e81b88 (64 bytes)
[I@0x24292410 (4194312 bytes)
[I@0x256924d8 (4194312 bytes)
[I@0x23292330 (4194312 bytes)
[I@0x24692420 (4194312 bytes)
[I@0x25a924e8 (4194312 bytes)
[I@0x23692368 (4194312 bytes)
[I@0x24a92430 (4194312 bytes)
[I@0x25e924f8 (4194312 bytes)
[I@0x22e824d8 (1032 bytes)
[I@0x23a923a0 (4194312 bytes)
[I@0x22e822a0 (48 bytes)
[I@0x22e83a28 (48 bytes)
[I@0x24e924b8 (4194312 bytes)
[I@0x22e839b8 (48 bytes)
[I@0x26292660 (4194312 bytes)
[I@0x22e82090 (520 bytes)
[I@0x22e90ee8 (4194312 bytes)
[I@0x23e923b0 (4194312 bytes)
[I@0x252924c8 (4194312 bytes)
[I@0x22e81cb8 (116 bytes)
Of the 21 entries, 14 are relatively large, each taking up 4MB of memory. Clicking on an object instance in jhat leads to a screen similar to what is shown in Figure 2, which is for one of those 4MB int[] arrays. To help you figure out why an object is on the heap, jhat provides a list of other objects that are referring to the object. In this example, only one object holds a reference, and it’s a WorkDetails object.
If you use jhat to follow the chain of references, you’ll eventually discover one or more references from a root set object. The root set objects are the starting point for the garbage collector: any object reachable from a root set object is not a candidate for garbage collection. For a full definition of root set, refer to the Memory Management Glossary at http://www.memorymanagement.org/glossary. In this example, the root set object is the static field myMap, which holds a reference to a HashMap, which in turn holds multiple references to HashMap$Entry objects, each of which holds a reference to a WorkDetails object. Until that chain of references is broken, the int[] arrays referred to by each WorkDetails object can’t be garbage-collected.
In addition to jhat, a wide variety of tools (commercial and open source) can inspect a heap snapshot. Some provide graphical views of the relationships between the objects on the heap. Most calculate helpful values, including a measurement of the amount of heap memory that could be garbage-collected if a particular reference were null. Many of the tools can gather and inspect a heap snapshot while your application is running. The most sophisticated tools provide capabilities to compare two heap snapshots so that you can see what changed from one point in time to the next. For more information on Java profiling tools, refer to the list available at http://www.javaperformancetuning.com/tools/index.shtml.

FIG. 3: LIKE-AGED OBJECTS, ALL IS WELL
FIG. 4: SHORT-LIVED OBJECTS ALSO GEL
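The jhat figures quoted in this section can be cross-checked with a little arithmetic. The sketch below copies the 21 sizes from the instance list and assumes each reported size is 8 bytes of overhead plus 4 bytes per int; that overhead is an inference from the listed numbers, not something the article states, and real object layouts vary by JVM. The class name JhatSizeCheck is hypothetical.

```java
// Hypothetical cross-check of the int[] sizes reported by jhat in the example.
class JhatSizeCheck {
    static final long OVERHEAD_BYTES = 8; // assumption inferred from the listed sizes
    static final long INT_BYTES = 4;
    static final long LARGE_ARRAY_BYTES = 4194312L;

    // The 21 sizes, in the order they appear in the instance list.
    static final long[] SIZES = {
        64, 4194312, 4194312, 4194312, 4194312, 4194312, 4194312,
        4194312, 4194312, 1032, 4194312, 48, 48, 4194312,
        48, 4194312, 520, 4194312, 4194312, 4194312, 116
    };

    // Number of array elements implied by a reported size, under the assumption above.
    static long impliedLength(long reportedBytes) {
        return (reportedBytes - OVERHEAD_BYTES) / INT_BYTES;
    }

    static int countLarge(long[] sizes) {
        int count = 0;
        for (long size : sizes) {
            if (size == LARGE_ARRAY_BYTES) count++;
        }
        return count;
    }

    static long totalBytes(long[] sizes) {
        long total = 0;
        for (long size : sizes) total += size;
        return total;
    }

    public static void main(String[] args) {
        System.out.println("large arrays: " + countLarge(SIZES));
        System.out.println("ints per large array: " + impliedLength(LARGE_ARRAY_BYTES));
        System.out.println("total bytes: " + totalBytes(SIZES));
        System.out.println("total MB: " + totalBytes(SIZES) / (1024.0 * 1024.0));
    }
}
```

Under that assumption, each large entry corresponds to an array of 1,048,576 ints, and the total of all 21 entries comes to slightly more than 56MB, which agrees with the histogram figure.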
Using Instrumentation
In some situations, inspecting the heap isn’t a practical approach or doesn’t provide enough information. The simple example shown above had only a few classes and a relatively small heap. Java applications in deployment frequently have class and instance counts that are much larger. Examining all the class and object information for a large Java application can be time-consuming, particularly if you’re not familiar with the application’s source code.
As an example, consider the instances of the WorkDetails class. You can use jhat with the example shown above to see that there are 22 instances of the WorkDetails class on the heap. If you’re familiar with the application’s source code, you know that this is suspicious: WorkDetails objects should be short-lived because they’re intended for processing only a single request. If you aren’t familiar with the source code, you have no way to know whether the number of class instances indicates a potential memory leak or not. This problem is especially difficult when the memory leaks are caused by references to commonly used classes such as String.
Using instrumentation can help you identify which object instances are the most likely cause of memory leaks. Instead of showing relationships between objects, instrumentation tools allow you to watch your code’s memory allocation trends as your application runs. For each class, there are several measurements: the number of instances created, the number of instances that are still on the heap, the amount of heap space used, and the generation count. The first three are self-explanatory, but generation count requires an explanation.
The generation count doesn’t refer to the various heap generations (young, tenured and permanent), but is a description of application behavior. Each time an object survives a run of the JVM’s garbage collector, its age increases by 1. Each object on the heap therefore has an age. The generation count for a class is simply the number of different ages across all instances of that class. Refer to Figures 3 through 5 for examples: Each oval represents an object, and each is annotated with its age. Each vertical line represents a run of the garbage collector.
An increasing value for generation count is alarming because it indicates that as your application runs, your code is allocating new instances of that class without releasing all references it already has to objects of that class.
Let’s look at two examples of applications that aren’t leaking. If your application creates three instances of class Foo when it starts, and continues to hold those references forever, the generation count for Foo would remain fixed at a value of 1 (see Figure 3), meaning that all object instances have the same age. This is because all instances of Foo have survived the same number of garbage collections; in this case, seven. Likewise, there is no leak if your application creates instances of Foo for usage only during very short periods of time and always lets go of its references to each Foo object. In this case, illustrated in Figure 4, the generation count would likewise remain low because Foo objects wouldn’t live very long on the heap. Again, the number in the oval is the object’s age, which is the number of times it has survived a garbage collection (in this case, zero).
On the other hand, when the generation count is always going up (that is, there continues to be an increase in the number of different ages among instances of the same class), you’re probably leaking instances of that class. However, there is no “correct” value for generation count. The key factor to monitor is whether the generation count for a class stabilizes or continues to rise (as in Figure 5). Here, objects of the class are constantly being allocated without being released, so the generation-count value is always increasing; the number in the oval is the object’s age, which is the number of times it has survived a garbage collection. In the snapshot shown in Figure 5, generation count is 8 because there are eight different object ages. The value will continue to rise as more Foo objects are allocated.
FIG. 5: LEAKS OCCUR WHEN AGES SWELL
Figure 6 shows sample output from the NetBeans Profiler, which offers instrumentation as one option for doing memory profiling. The example program from Listing 1 was used to generate the results, which are sorted by generation count (the far-right column). During the running of the application, the generation counts for the WorkDetails, int[] array and HashMap$Entry classes continued to climb while those for the other classes remained relatively stable.
FIG. 6: NETBEANS PROFILER
If instances of a class appear to be causing memory leaks, the next step is to identify the specific instances that are leaking. In other words, some object instances of a class might be causing a memory leak, while others are not. This is particularly true when the objects are of a commonly used class. The key to success is to find which allocations of those instances are causing the problem.
Figure 7 shows two different stack traces for the allocation of WorkDetails objects by the example program. The seven WorkDetails objects allocated by the HasMemoryLeak constructor (shown in the stack trace as <init>) all have the same age, so the generation count for those allocations is one. They are therefore probably not causing a problem. The relatively high value of the generation count for the WorkDetails objects allocated by the processRequest() method indicates they probably are causing a memory leak. This information is especially helpful when working with an application with unfamiliar source code. You can use your inspection of the code that’s doing the allocations with high generation counts as the starting point to investigate why the objects it allocates aren’t being removed from the heap. More information on the NetBeans Profiler is available at its home page (http://profiler.netbeans.org/).
FIG. 7: STACK TRACE

TABLE 2: HEAP INSPECTION VS. INSTRUMENTATION
Question | Heap Inspection | Instrumentation
Impacts application performance? | Very little or no impact | Can be significant
Shows relationships between objects? | Yes | No
Identifies objects most likely to be memory leaks? | No | Yes
Scales with increase in heap size? | Can be a problem | Usually not an issue
Useful even if you aren’t familiar with the application’s source? | Can be difficult to interpret | Yes
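The generation-count measurement described in the instrumentation discussion above reduces to counting distinct ages. The following sketch is only an illustration of that definition, not how any particular profiler computes it; the class name GenerationCount is hypothetical, and the specific ages used for the leaking case (0 through 7) are an assumption, since the article says only that Figure 5 shows eight different ages.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical illustration of the generation-count definition.
class GenerationCount {
    // ages[i] is the number of garbage collections instance i has survived.
    static int generationCount(int[] ages) {
        Set<Integer> distinctAges = new HashSet<Integer>();
        for (int age : ages) {
            distinctAges.add(age);
        }
        return distinctAges.size();
    }

    public static void main(String[] args) {
        // Three long-lived instances that have all survived seven collections.
        System.out.println(generationCount(new int[] {7, 7, 7}));
        // Short-lived instances that have not yet survived any collection.
        System.out.println(generationCount(new int[] {0, 0, 0}));
        // Instances allocated continuously and never released (assumed ages 0..7).
        System.out.println(generationCount(new int[] {0, 1, 2, 3, 4, 5, 6, 7}));
    }
}
```

The first two cases yield a generation count of 1 and the third yields 8; what matters when profiling is whether the value for a class keeps rising over time, not its absolute size.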
How to Choose a Technique
For tracking down heap memory leaks, the techniques break down into two categories: heap inspection and instrumentation. Each approach has strengths and weaknesses (see Table 2). With heap inspection, you can see the relationships between the objects, but you have no information about which parts of the code caused those relationships. With instrumentation, you get information about the behavior of the code, but you lack a direct view of the relationships between the objects. In production environments, the runtime overhead imposed by instrumentation might be a problem, although heap inspection also has scalability problems as your application’s heap size increases.

ACKNOWLEDGMENTS
Many thanks to Alan Bateman and Frank Kieviet for their suggestions and advice on tracking down all sorts of memory leaks.
Parasoft SOAtest™
Verifies Web services interoperability and security compliance. SOAtest was awarded “Best SOA Testing Tool” by Sys-Con Media Readers.
Parasoft Jtest™
Verifies Java security and performance compliance. Judged InfoWorld’s Technology of the Year pick for automated Java unit testing.
Parasoft WebKing™
Verifies HTML links, accessibility and brand usage, and manages the key areas of security and analysis in a single integrated test suite.
Improving productivity can sometimes be a little sketchy.
Diagram labels: Web Services, Application Server, Database Server, Application Logic, Presentation Layer, Legacy, Website, Thin Client
Let Parasoft fill in the blanks with our Web productivity suite. Parasoft products have been helping software developers improve productivity for years. Jtest, WebKing and SOAtest work together to give you a comprehensive look at the code you’ve written, so you can be sure you’re building to spec, the new code doesn’t break working product, and any problems can be fixed immediately. Which means you’ll be writing better code faster. So make Parasoft part of how you work today. And draw on our expertise.
Go to www.parasoft.com/STPmag. © Parasoft Corporation. All other company and/or product names mentioned are trademarks of their respective owners.
Learning To Groove With The Gremlins
How To Embrace The Inevitable: Chaos And Complexity
By Dr. Linda J. Burrs

You rush to add the finishing touch to an eight-week project you’ve poured your heart and soul into, when the server crashes and you can’t find your work. You need to retrieve it quickly because the senior executive expects it in an hour. What happened? Gremlins!
Your best employee tells you she’s leaving the organization to work for your #1 competitor. She cites a lack of meaningful work and a less-than-appealing work environment. You thought she was happy. What happened? Gremlins!
You wait until the last minute to leave for work because you’re watching an interesting news video. The story has special meaning, and you believe the concepts expressed in this news piece may help your team move beyond a sticking point. You plan to share this with your team at your 8:30 a.m. meeting. As you jump into your vehicle and speed away, you realize you have a flat tire. Why today, of all days? Gremlins!
Most of us have heard of the butterfly effect. Can a butterfly flap its wings in China and create a hurricane in the U.S.? The question is asked not necessarily to be answered, but to demonstrate how small and seemingly insignificant changes in one area may have a major, even catastrophic impact in another place. In our linear way of thinking, if we knew or thought we could control or stop an unwanted event from happening, we would need only to capture that butterfly and imprison it forever, or smash it and the problem would be solved. Well, would it? Most likely not.
Dr. Linda J. Burrs is the founder of Step Up To Success, a management training consultancy based in Dayton, OH.
Illustration from Inmagine.com

On the Edge of Chaos
We all live at the edge of chaos without recognizing it, so let’s explore this terrain a bit more. The baggage that surrounds the word chaos is so negative that we often miss the positive impact of a more nuanced understanding of the word. Yes, chaos typically refers to a state of confusion and uncontrollable experiences or events. But it can also be viewed as the natural disorder and unpredictability in our lives that may help us get what we want. The fact is, uncontrollable events are always at play in our lives. We can’t do enough controlling to make chaos go away without sacrificing creativity and innovation, ultimately destroying our ability to grow and evolve into the successful people we desire to be.
Remember Gizmo, the adorable, fuzzy and exotic little creature from the movie “Gremlins.” In the gremlins’ world, there were only three rules: 1. Don’t get the Mogwai wet; 2. Don’t put the Mogwai in bright light; and 3. Don’t feed the Mogwai after midnight. Nothing mind-blowing, earth-shattering or complex. These simple rules should have been simple to follow. Yet, in spite of best efforts, water was accidentally spilled on Gizmo (the Mogwai’s new name), and the impending chaos wreaked havoc on the entire town.
Most of us have firsthand experience from encounters with gremlins in the many areas of our lives. This is not news. What is news is that by drawing from the gremlin experience, we can find order, growth and help in accomplishing our goals. So how can we make troublesome, uncontrollable events work in our favor? We’ve all been conditioned to make every effort to control what is uncontrollable. We like to believe that if there are enough rules and regulations, we can eliminate ambiguity and chaotic unpredictability. But just as in the Gremlins movie, in spite of our best efforts to follow the rules, we often find chaos running just beneath the surface, outside our field of vision.
One way to explain how gremlins get and stay in the system is through a simple understanding of complexity. Complexity science helps us understand why chaos and unpredictability exist, and how they may help us.

The World of Linear And Nonlinear Systems
Complexity can be simply explained as a whole containing many parts in a relationship in which the parts are unpredictable and dynamic. In a linear system (and we like to believe we live in one), the whole is simply the sum of all the parts. I’m a human being (whole) made up of bone, tissue, blood, organs, etc. (parts). If I put enough controls in place, I may contain all the parts and make them work the way I want. If I eat the right foods, exercise daily and get enough sleep, I should be able to control my health and stay well. But alas, that’s not always the case.
In the nonlinear system (complex adaptive system), the whole is greater than the sum of its parts. This means that the parts each have their own multidirectional, complex system that’s constantly supporting, feeding and balancing the whole with unanticipated, unexpected and unpredictable outcomes, which are most often uncontrollable. To deal effectively with these complex relationships that compound our existence, we need help.

Learn to Live With The Gremlins
We can’t control chaos, so we have to learn to live with the gremlins. Every choice we make leads to complex behaviors that influence outcomes, none of which can be predicted or controlled. But we think we’re more in control than we actually are, so when gremlins show up in the system, we’re thrown off balance. So servers crash, good employees leave, and tires go flat at the most inopportune times. What can be done about these gremlins?
Indeed, reality is a moving target, and uncertainty is a fact of life. When we don’t accept this fact, gremlins can wreak havoc in our lives and we feel out of control and stressed. However, when we understand and acknowledge the fact that complex systems reside within every interaction and every relationship, we’re less stressed and more capable. How do we get there?

Stick to Your Theories, But Expect The Unexpected
Relax and learn to share your space with gremlins. You may also want to take some time to understand and embrace complexity and chaos theories. A study of complexity helps us recognize that systems nested inside other systems are unpredictable and at times seem unmanageable. Complexity burns energy, and you’ll burn yourself to a crisp trying to control the gremlins. Instead, learn how to expect the unexpected.
Today, a friend called because she dropped her car key last night in a pile of snow but couldn’t find it. When she looked for the key this morning, she realized that the snowplow had piled the snow even higher over the place she dropped the key. She decided to have the car dealership make her another key using her VIN number. It was bitter cold outdoors, with a wind chill at -3 degrees. She tried to walk but got too cold. By the time she called me, she was in a bit of a quandary. Clearly, gremlins were loose in the system, and they proliferated rapidly.
I picked her up and took her to the dealership, only to find out they couldn’t make the computerized key. They recommended another dealership, but the gremlins foiled us again. Finally, we used our own smarts to hunt down the key at a third location.
The secret to managing this type of constantly changing complexity? Expect the unexpected by formulating a Plan B at the same time that you come up with Plan A. This way, you aren’t derailed when the gremlins pop up again.

Focus on Present Possibilities
Prepare for the future by focusing on and staying in the present. According to a colleague, focusing on present possibilities helps keep us sane. He suggests that when we aren’t focused on the here and now, we miss opportunities and experiences that may help us navigate the waves of chaos we encounter every day. For anyone who has ever witnessed rush hour in Grand Central Station, the need to stay focused on the present may be understood all too well. Visitors to the city who aren’t familiar with the amazing apparent chaos of systems embedded into more and more systems may be overwhelmed at the sight. In spite of apparent chaos, just beneath the surface lies a structure that brings order to the seeming madness. Focusing on only the complexity of the matrix blinds you to opportunities to enjoy the energy and excitement of one of this nation’s most energized cities. Stay focused on what needs to get done now and know that even then, you should be ready for gremlins.

Recognize Emotion Before Making a Decision
You make choices every day without much thought to the outcomes. When you experience stress or get the feeling something needs to change in your life, you should stop, look and listen for gremlins. Once you acknowledge that you can’t control the uncontrollable (gremlins), you can be at peace with yourself and others. And when the unexpected does show up, you can allow yourself time to acknowledge what you’re feeling before you decide what to do next.
Learn from your fears. Allow yourself the luxury of acknowledging your feelings. Learning to live with chaos means thinking creatively and instinctively. Pay attention to those instincts and use them to inform your decisions.

Perspective Is Everything
Sometimes change is hard. For years we’ve been told that we’re resistant to change. For a while, I felt out of place because I didn’t feel as if I were resisting change. In fact, I liked some change, and liked it a lot. Complexity helps me understand why I’m not resistant to change, and why you most likely aren’t, either.
Perhaps you’d done everything you could to retain that great employee, but she chose a job closer to home. What we sometimes forget is that change activates possibilities in ways we could never have imagined if we allow the gremlins to have their way. I believe that instead of trying to control gremlins, we need to control our linear, straight-line thinking to more flexibly manage change and eliminate problems that keep us stressed and confused.
Let’s look at what complexity teaches us. Complexity takes us into the realm of chaos and attractors. From a social science perspective, for many of us, personal values, beliefs, assumptions and culture are our attractors: the elements that push and pull us in ways that are sometimes surprising. When we examine this process closely, we gain a different perspective of our behavior and what attracts us to respond as we do. When we give up our need to control our environment based on what attracts us, the gremlins can’t upset the apple cart as much as they used to, before we learned to let go.

Don’t Contain Complexity
We all share our lives with the gremlins of chaos, dissonance and complexity. As long as there’s some chaos, we know we’re alive. Perhaps we’ve been overly conditioned to fight and resist the work of chaos and complexity; no matter how hard we resist, chaos will prevail. Cells are dying, and new ones are taking their place. Hairs are falling out of our heads; sometimes they come back, and sometimes they don’t. As long as we’re subject to negative conditioning toward chaos and complexity, we’ll remain stressed and fearful.
When we try to over-control or over-contain the work of the gremlins in our systems, we persistently miss opportunities for greater learning, greater personal growth and greater influence. Instead of corralling them, follow the rule of complexity and learn to coexist with the gremlins in your systems. You may gain some unique insights from embracing change instead of resisting it. The gremlins will always be there, forcing you to adjust your coping mechanisms. You can either spend your life fighting them, or simply adapt to their existence and plan accordingly.
Perforce
Fast Software Configuration Management
Introducing Time-lapse View, a productivity feature of Perforce SCM. Time-lapse View lets developers see every edit ever made to a file in a dynamic, annotated display. At long last, developers can quickly find answers to questions such as: ‘Who wrote this code, and when?’ and ‘What content got changed, and why?’ Time-lapse View features a graphical timeline that visually recreates the evolution of a file, change by change, in one fluid display. Color gradations mark the aging of file contents, and the display’s timeline can be configured to show changes by revision number, date, or changeset number. Time-lapse View is just one of the many productivity tools that come with the Perforce SCM System.
Perforce Time-lapse View
Download a free copy of Perforce, no questions asked, from www.perforce.com. Free technical support is available throughout your evaluation.
Photograph by Joann Snover
Can You Find the Bugs In Your GUI Application?
By Dan Rubel and Phil Quitslund

Graphical user interfaces today are among the most complex and interactive software tools designed. Customers expect the same high quality in a GUI that they get from simpler software tools. When users press a key, click the mouse or select menus, they expect consistent results. Tossing a GUI-based application over the wall for testing or QA is no longer an option. To build software faster and still meet customer expectations for high quality, companies must discover any bugs before their customers do. Not only is this more cost effective, it also fosters greater customer satisfaction. To create high-quality user interfaces, testing means checking the UI throughout the development process. This includes exhaustively exercising the user interface.
Dan Rubel is CTO and Phil Quitslund is a senior architect at Instantiations.

Manual Testing
Manual GUI testing is time consuming, labor intensive and expensive. Based on written directions explaining the steps and expected results, it requires a labor force of either the QA department or an outsourced team. It requires writing a script, the use case or other set of detailed instructions that describe user actions and their expected outcomes. This process requires its practitioners to follow directions explicitly, uncover and record any problems, and return the results to the development team. Since it’s human powered, the process is tedious, error prone and costly. Because of the cost, manual testing is usually performed not continuously throughout the development cycle, but just before the software is released.
GUI DEBUG
TABLE 1: ON THE MENU

Tool                              Availability    Type
Abbot                             Open source     Library (Java)
Costello                          Open source     Recorder (XML)
GUIdancer                         Commercial      Recorder
JDemo                             Open source     Library (Java)
Mercury QuickTest Professional    Commercial      Hybrid (VB Script)
QF-Test                           Commercial      Recorder (XML/Jython)
Rational Functional Tester        Commercial      Hybrid (Java/.NET)
TPTP AGR*                         Open source     Recorder (XML/Java)
Window Tester                     Commercial      Hybrid (Java)

* TPTP AGR = Eclipse Test and Performance Tools Platform Automated GUI Recorder
Automated Testing
Automated testing employs a “test” program to drive a set of inputs to exercise some part of software code to see if it delivers a desired result. Automated testing can be split into two broad categories: API and GUI. Both involve driving the application and verifying results. Automated API testing, the more common technique, calls methods and validates return values. The more difficult method, automated GUI testing, interacts with the application as a user would: through the user interface. Both approaches are necessary for complete testing of an application.

Testers should seek to validate the UI early in the process, and then frequently throughout the development cycle to ensure that new features and fixes don’t introduce new bugs. To achieve this, development must be integrated with quality assurance testing.

Because it’s machine powered, automated GUI testing executes faster and can be performed more frequently than manual testing. However, not all tests can be automated. The best practice is to automate tests that can be automated, execute them during an automated build process, and use manual tests for UI interactions that are too difficult or impossible to automate. By combining the two approaches, you can achieve the best quality at the lowest cost.
GUI Testing Approaches
Writing a comprehensive test procedure is tedious, and the process of automating and maintaining that procedure is difficult. Numerous tools exist to help with this, and can reduce automation time and cost considerably (see Table 1).

Automation tools offer varied approaches. Some tools simply record use cases and offer no access to the source code run during playback; others offer source code libraries used in hand-coded automated UI tests; and others take a hybrid approach, generating source code as a starting point and facilitating hand-modification of recorded tests. Each approach has its strengths and weaknesses.

Easy button. Click-and-record UI testing tools generate test scripts by watching a user interact with an application under test. Typically, the user launches the tool and the application, and then starts the UI test recorder. As the user interacts with the application, the UI test recorder gathers all of the UI events and captures state information about the visible UI components, such as whether a button is enabled, or whether a check box or text in a particular field is selected. The test tool then emits a script that can be used to replay the recorded actions. Some tools permit the scripts to be parameterized later with different inputs for the same test, allowing a single test to cover multiple situations. Recorded actions are typically stored as XML-based scripts or Java use cases. Click-and-record products can be used by non-programmers.

Less easy button. Products that provide a UI testing library with which UI tests can be hand-coded obviously require knowledge of a programming language. The initial tests are written to call the API provided by the testing library. Over time, as the number of tests grows, common elements of the tests being written are factored out into utility classes and methods, making the creation of subsequent tests easier. Although UI testing libraries take more time to understand and code tests initially, they provide a more flexible long-term solution.

Best of both button. Hybrid tools offer the most options. They have functionality to record the user’s interactions with the application under test and can generate test scripts. But rather than storing the scripts in a descriptive format such as XML, hybrid products create test suites in a general-purpose programming language on top of a testing library. Hybrid products can leverage non-technical people to generate the initial tests, the scripts for which can be edited, stored and reused. The downside to this and other methods of automated software testing is interpretation of recording results. Considerable effort is required to read the code generated and understand what it is doing.

Regardless of the tools you choose to create and exercise your GUI tests, challenges include finding widgets, verifying state (assertions), staying in sync, handling the unpredictable, adapting to platform differences and maintaining tests.
Finding Widgets
For each UI operation such as clicking a button or selecting a menu item, testers must locate the UI component to be manipulated. Problems of ambiguity can arise when attempting to identify individual widgets. For example, there can be identical items in menus or trees, or multiple labels or buttons with the same text. One remedial practice involves inserting code to associate a unique internal identifier with each widget so that it can be uniquely identified when testing. Another remedy involves locating a widget based on its container hierarchy. Also practiced is location by screen position, but this fragile method should be used only as a last resort.

Different tools solve this problem in different ways. The Automated GUI Recorder, part of the Eclipse Test and Performance Tools Platform (collectively the TPTP AGR), stores test scripts in XML and identifies the widget using a weighted widget identifier. The widget identifier is stored as a command element in the test script and might look something like this:

    <command descriptive="Transfer Shell Menu Action" type="select"
        contextId="menus"
        widgetId="org.eclipse.ui.internal.ActionSetContributionItem#{{Transfer-&amp;Transfer}}-{{0.8}}{{5|0}}{{0.6}}{{true}}-{{0.1}}{{1}}-{{0.1}}{{first}}-{{0.2}}{{last}}-{{0.2}}" />
In contrast, Abbot, an open-source Java GUI testing framework, offers a programmatic solution to this problem by providing matchers used to identify the desired widget from within a container hierarchy. Locating the first widget of type “Tree” in a container hierarchy might look something like this:

    tree = finder.find(new IndexMatcher(new ClassMatcher(Tree.class), 1));
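The first two lookup strategies described above (a unique internal identifier, then the container hierarchy) can be sketched without any GUI toolkit. The following is a minimal illustration only: the Widget and WidgetLocator classes are hypothetical stand-ins for real toolkit widgets and are not part of Abbot, TPTP AGR or any other tool named here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;

// Hypothetical stand-in for a toolkit widget: an optional test id, a label, children.
final class Widget {
    final String id;    // unique identifier assigned for testing; may be null
    final String text;  // visible label, which may be shared by several widgets
    final List<Widget> children = new ArrayList<>();

    Widget(String id, String text) {
        this.id = id;
        this.text = text;
    }

    // Adds a child and returns this widget so trees can be built fluently.
    Widget add(Widget child) {
        children.add(child);
        return this;
    }
}

// Hypothetical locator showing two of the strategies from the text.
final class WidgetLocator {
    private WidgetLocator() {}

    // Strategy 1: depth-first search for a unique internal identifier.
    static Optional<Widget> findById(Widget root, String id) {
        if (id.equals(root.id)) {
            return Optional.of(root);
        }
        for (Widget child : root.children) {
            Optional<Widget> hit = findById(child, id);
            if (hit.isPresent()) {
                return hit;
            }
        }
        return Optional.empty();
    }

    // Strategy 2: walk the container hierarchy by label, one level per path element.
    static Optional<Widget> findByPath(Widget root, String... labels) {
        Widget current = root;
        for (String label : labels) {
            Widget next = null;
            for (Widget child : current.children) {
                if (label.equals(child.text)) {
                    next = child;
                    break;
                }
            }
            if (next == null) {
                return Optional.empty();
            }
            current = next;
        }
        return Optional.of(current);
    }
}
```

With two dialogs that each contain an “OK” button, the label alone is ambiguous, but either the identifier or the path through the enclosing dialog picks out exactly one button. Location by screen position is deliberately left out, in line with the advice to treat it as a last resort.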
Verifying State (Assertions)
Before and after a UI operation, testers should verify that aspects of the GUI are exactly as expected. Implicit verification occurs when a widget is located to be manipulated; if the widget can’t be located, the test fails. Not all UI state is interesting, but UI state that changes in one widget as a result of a UI operation in another should be explicitly verified as part of the test. For example, verifying that the content of a tree has been updated as a result of clicking a button is interesting, while verifying that a check box has changed state as a result of clicking it is typically not interesting.

Tools that provide a UI-testing API in a general-purpose programming language offer maximum flexibility in this regard. Any widget can be accessed, and any application state can be verified. For example, in Abbot, verifying that a tree has a particular selection might look something like this:

    // Exactly one item should be selected, and it should be the expected file
    selection = treeTester.getSelection(tree);
    assertEquals(1, selection.length);
    assertEquals("report.pdf", treeTester.getText(selection[0]));
Click-and-record tools have different levels of possible verification depending on their design. Some provide a fixed set of verification operations, while others provide the ability to call a class or method to perform verification programmatically.
Staying in Sync
Keeping the test synchronized with the UI can be complicated. Typically the test executes in a separate thread from the UI. A good test framework hides this issue from the developer so he can focus on what should be tested and not on how it’s accomplished. For example, a UI operation that involves a transaction over the network may take an unpredictable amount of time. One way to account for the unpredictable duration is to delay the test some fixed amount of time and hope that the transaction has completed before the test completes. A better and less fragile way is to set up a wait condition that detects when the transaction is complete and continues executing the test.

Some click-and-record tools such as TPTP AGR hide all of the UI test playback behind a playback UI. The price of this simplicity is the framework’s inability to deal with unpredictable timing. With hybrid solutions, playback is handled through an API that hides the complexity and provides the ability to deal with unpredictable timing. Pure library API solutions such as Abbot also provide that flexibility, but at a price; playback issues such as threading must be tackled manually. For example, waiting for a progress dialog to close (indicating that an operation has completed) might look something like this:

    Robot.wait(new Condition() {
        boolean inProgress = true;
        public boolean test() {
            // Widgets may be queried only on the UI thread, so check there
            Display.getDefault().syncExec(new Runnable() {
                public void run() {
                    Shell shell = Display.getDefault().getActiveShell();
                    if (shell == null || !shell.getText().equals("Progress Information"))
                        inProgress = false;
                }
            });
            return !inProgress;
        }
    });

Handling the Unpredictable
The outcome of UI operations is hard to predict, and not all unpredictable outcomes are error conditions. For example, a dialog may appear the first time a UI operation is performed but not again during that session. While this situation can be coded into the first test executed, that approach is fragile. It’s better to set up a condition handler that watches for this dialog to appear and responds accordingly.

Test libraries and hybrid solutions provide the greatest flexibility when dealing with the unpredictable; these tools offer a UI-testing API in a general-purpose programming language to programmatically react to specific unpredictable situations. For example, given Abbot and the unpredictable dialog situation described above, you could fork a thread (see Figure 1) to wait for the appearance of a dialog with a particular title, and when that dialog appears, click the button to dismiss the dialog and resume the test. Coding such an action would look something like this:

    final Thread handleWarningThread = new Thread("Handle Warning Thread") {
        final Object lock = new Object();
        public void run() {
            try {
                while (true) {
                    // wait() and notifyAll() both require holding the lock's monitor
                    synchronized (lock) {
                        Display.getDefault().asyncExec(new Runnable() {
                            public void run() {
                                synchronized (lock) {
                                    lock.notifyAll();
                                }
                                Shell shell = Display.getDefault().getActiveShell();
                                if (shell != null && shell.getText().equals("Access Error")) {
                                    Button button = (Button) finder.find(
                                        new ClassMatcher(Button.class));
                                    new ButtonTester().actionClick(button);
                                }
                            }
                        });
                        lock.wait(); // released until the UI thread runs the check
                    }
                    Thread.sleep(500);
                }
            } catch (InterruptedException e) {
                // Interrupted: stop watching for the dialog
            }
        }
    };
    handleWarningThread.setPriority(Thread.MAX_PRIORITY);
    handleWarningThread.setDaemon(true);
    handleWarningThread.start();

[FIG. 1: HEY, ABBOT! Diagram labels: Test Thread, SWT Thread, Condition Thread, Fork, SWT Event, Abbot Library]
Given a hybrid tool and this same situation, the solution might be a single call to notify the runtime engine and handle the unpredictable dialog. This same technique could be used with click-and-record tools as long as they provide a call-out hook to a programming language.
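Both the wait condition and the dialog handler discussed above come down to polling for a state instead of sleeping for a fixed time and hoping. A toolkit-independent sketch of that idea, using only the Java standard library, might look like the following. The Waiter class and its waitFor method are hypothetical names chosen for illustration; they are not the API of Abbot or of any hybrid tool.

```java
import java.util.function.BooleanSupplier;

// Hypothetical helper: poll a condition until it holds or a timeout expires.
final class Waiter {
    private Waiter() {}

    /**
     * Returns true as soon as the condition reports true, or false if the
     * timeout elapses first or the waiting thread is interrupted.
     */
    static boolean waitFor(BooleanSupplier condition, long timeoutMillis, long pollMillis) {
        long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        while (true) {
            if (condition.getAsBoolean()) {
                return true;
            }
            if (System.nanoTime() - deadline >= 0) {
                return false; // timed out without the condition becoming true
            }
            try {
                Thread.sleep(pollMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt flag
                return false;
            }
        }
    }
}
```

A test would pass a condition such as “the progress dialog is no longer the active shell” and fail with a clear message when waitFor returns false. In a real SWT test the condition itself would still have to query widgets on the UI thread, as the Abbot example does with syncExec.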
Platform Differences
Different operating platforms have different accelerator key labels (Win32’s is Alt+Shift+N, while Linux uses Shift+Alt+N), default selection events, focus differences and tree item selection. Ideally, tests should not include platform-specific references so as to allow the same tests to exercise and validate the same application written for different platforms.
How platform differences are handled depends very much on the tool being used. Some click-and-record tools such as TPTP AGR provide data parameterization that may be used to smooth out some differences between platforms. Others allow for regular expressions to be used as parameters in the test script. And, of course, tools that offer a UI-testing API give you the full power of that language.
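One way to keep a platform-specific accelerator label out of a test assertion is to normalize labels before comparing them. The sketch below is an assumption about how such a helper could look, not a feature of any tool named here: the AcceleratorLabels class is hypothetical, and it assumes labels are modifier names joined by “+” with the key last.

```java
import java.util.Arrays;
import java.util.Locale;

// Hypothetical helper: compare accelerator labels regardless of modifier order.
final class AcceleratorLabels {
    private AcceleratorLabels() {}

    // Canonical form: lower-case modifiers in sorted order, then the upper-case key.
    static String normalize(String label) {
        String[] parts = label.trim().split("\\s*\\+\\s*");
        String key = parts[parts.length - 1].toUpperCase(Locale.ROOT);
        String[] modifiers = Arrays.copyOf(parts, parts.length - 1);
        for (int i = 0; i < modifiers.length; i++) {
            modifiers[i] = modifiers[i].toLowerCase(Locale.ROOT);
        }
        Arrays.sort(modifiers);
        StringBuilder canonical = new StringBuilder();
        for (String modifier : modifiers) {
            canonical.append(modifier).append('+');
        }
        return canonical.append(key).toString();
    }

    // True when both labels name the same key with the same set of modifiers.
    static boolean sameAccelerator(String first, String second) {
        return normalize(first).equals(normalize(second));
    }
}
```

A test that asserts sameAccelerator(expected, actual) instead of comparing raw strings treats the Win32 and Linux labels from the example above as equal. The sketch does not handle an accelerator whose key is itself “+”.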
Maintaining Tests
As the GUI changes during the lifetime of an application, adapting the test can be tricky. Something as simple as changing a label from “disconnect” to “terminate” requires a change in the test details. Developers must make changes to GUI tests as the interface evolves. If the test script is readable, it makes test maintenance much easier. The readability of the test script depends on the chosen tool and the effort expended during test creation to structure the tests.

With click-and-record tools, some changes in data inputs are easy to make using that tool’s user interface; other changes require modifications to the tool’s test script that might be more difficult. In the face of a dramatic UI change, rerecording is always an option and sometimes is the easiest approach. For tools that provide a UI testing API, test maintenance is dependent on the test library API and how the tests were created and refactored over time.

An automated GUI-testing tool must be easy to use to encourage adoption while providing the flexibility to react to unforeseen situations and the maintainability to adapt to UI modifications over time. With the time between an application concept and deployment shrinking, as much of the testing process as possible must be automated, and automated GUI testing is an important piece of that process.

Many tools offer the ability to automate GUI functional and regression testing by recording GUI interactions. Most also allow editing and reuse of those recordings, usually through proprietary scripting languages or XML. Only a few, however, generate recordings in a general-purpose language. So if that’s what your organization requires, be sure to read the fine print.
Best Practices
An Automated Test Postcard From Cairo
Geoff Koch

Toiling as a tech drone in one of the countless North American cubicle farms, it can be tempting to think that life must be better somewhere else. Overzealous vendors, endlessly incompatible code, unreasonable management expectations… sometimes it’s enough to make you ponder chucking it all for some moderately exotic locale.

How about Egypt? After all, you’ve always wanted to see the Sphinx and pyramids before you die. Heck, if you like it and want to extend your stay, you can even pick up some nuts-and-bolts programming work doing something like, say, test automation. Maybe the tech life in a slightly less mature market is, dare you think it, a little more fun?

If you’re enjoying the fantasy, stop reading and start Googling those Middle East travel Web sites. The word from the mouth of the Nile River is that headaches associated with automated testing are the same there as anywhere.

Meet Cairo-based Sameh Zeid, a senior consultant with ITSoft, which provides software and services to banks throughout the Gulf region. The company, he says, maintains a monolithic code base underpinning its entire product line. And it’s fair to assume from his comments that shying away from customization makes life easier in terms of configuration management and revision control, but harder when it comes to quickly addressing new opportunities or fixing old bugs.

“We needed to shorten the release cycle time by reducing the schedule and effort of system testing,” Zeid says, explaining ITSoft’s current test-automation project, which involves IBM’s Rational Robot. “The testing was incomplete and causing havoc during user acceptance.”

The regression testing that was part of every build was like groves of date palms in the sweltering Middle East summer and early fall, when the trees are loaded with the sweet fruits; that is, seemingly ripe for harvest. And the promise of capturing and consistently reusing knowledge of analysts and testers seemed likely, Zeid says, to be “one of the biggest advantages” associated with the project.

But unfortunately, things start to get dicey here, in ways depressingly familiar to those involved in testing everywhere.

Scripts Don’t Write Themselves
First is the issue of assigning a veteran product analyst and tester to the automation team on a quasi-permanent basis, a shift in resources and headcount that resource-strapped managers everywhere, including ITSoft, can find to be prohibitively expensive. After all, automation is supposed to be labor-saving, especially for the highest-skilled workers, right?

Perhaps, but absent this support, Zeid says his team is left with the feeling that “we cannot trust the existing documented test cases. Unless we are able to automate what the veteran testers and analysts would do, we will end up with automation that is unreliable.”

Next is a related problem that’s entirely cultural. We’re not talking East versus West here; rather, coders and software engineers versus the testing and QA crowd. In Cairo as in California, developers too often consider testing, even the moderately sophisticated task of generating and maintaining test automation scripts, to be second-class work.

Then there are the potentially thorny and expensive technical issues, familiar to anyone who has tried to look beyond the
unwaveringly optimistic marketing promises proffered by tool vendors. ITSoft took advantage of what Zeid said was IBM’s special subsidized pricing for Egyptian companies to purchase what’s arguably one of the better-known automation tools in the market. Rational Robot is compatible with Centura 1.5, the semi-obscure language in which ITSoft’s application is written. But Zeid worries that very few IBM consultants have experience working with Centura, especially since it’s possible that, sooner or later, he’ll need those consultants’ help.

Finally comes the time-honored uneasiness about test automation’s return on investment. The number is calculated based on n number of releases using the tool, which may or may not happen, Zeid says. And he has additional squeamishness about maintenance efforts associated with test scripts, especially given the inevitable additions and changes from one release to another.

Zeid’s laundry list of test automation concerns would sound familiar to the likes of Paul Grogan, a developer with Los Angeles–based CKE Restaurants, parent company of fast-food companies such as Carl’s Jr. and Hardee’s. Grogan, a developer whose background includes Assembler, C, C++ and Java, supports corporate applications used by 120 company users.

Making money in fast food means more than foisting ever-larger servings on unsuspecting customers. These companies, CKE included, often find themselves with vast, disjointed real estate portfolios, management of which means using software to handle myriad leasing and profit-sharing agreements.

Grogan recently employed test automation in tweaking a COTS real estate–management application used by CKE. Part of his success was finding the right tool (his firm chose a product from Worksoft) that could handle testing even when the UI was a constantly moving target.

“Users are more savvy and less likely just to accept what developers hack together as far as a UI,” said Grogan. “Since today, less is happening on the middleware and more on the screen, there are more opportunities for traditional auto-test tools to break.”

Grogan’s approach to the project reveals attempts to solve several of the problems outlined by Zeid. To ensure quality, Grogan insisted that subject-matter experts, not QA staff, generate the test cases recorded by the automated tool.

Additionally, Grogan realized that his automated test system would fulfill its promise to make it cheaper to fix bugs and faster to incorporate feedback only if it was fed properly by the development team. So the testing team received a series of bite-sized, modular releases (27 in all during the three-month project) rather than being forced to digest one super-sized chunk of code when the first round of programming was done.

Better Tech and People Skills
Bill Hayduk, founder and president of New York City–based RTTS, a testing-focused professional services organization, has seen firsthand the challenges that crop up when mostly nontechnical QA and test organizations deal with increasingly sophisticated testing regimes. Like Zeid, Hayduk insists that any automated testing project, which inevitably includes managing a growing and changing portfolio of test scripts, must be treated like a full-fledged, bona fide development project.

Not that technical skills alone are enough. Hayduk says that all of his new programmers take 325 hours of training before being loosed on customer projects, which often involve test automation.

“The 325 hours are needed to teach graduates with programming skills how to best leverage these test tools, along with test methodology, project management concepts, software architecture and development of soft skills,” Hayduk adds.

One of these required soft skills, of course, is managing manager expectations. Reading that sunny marketing material, bosses may eagerly anticipate trimming the testing bill and shipping product faster, only to be disappointed with the final results. Zeid says cryptically that this cropped up during his project and was addressed by way of ominous-sounding “awareness sessions,” which bring to mind teeth-grinding hours of meetings with bean counters who don’t know the difference between a test script and a movie script.

So if you want to head to Cairo to see one of the Seven Wonders of the World, by all means start planning. Bring your camera, but be sure to leave any expectations of a frustration-free coding paradise at home.

Best Practices columnist Geoff Koch welcomes tales of programming triumphs and travails from outside the United States, particularly in emerging markets. Write to him at koch.geoff@gmail.com.
Future Test
Code Blue!
I.B. Phoolen

How do you rank the severity of application defects? Some test teams assign severity/priority scores, but that’s arbitrary, and doesn’t reflect the real-world impact of bugs. How can you really assess the importance of something that’s rated “medium” for severity but “low” for priority?

More practical dev teams use expressions to communicate, through the defect database, change management system, e-mail or sticky notes, how important a defect is to the team. “This one’s a showstopper,” you might say. Or “If you can fix this before the next release, that would be great,” you might comment. Or “Who cares about a teeny-weeny typo?” you might write. Or “Sheesh, this one’s definitely gonna get us sued,” you might opine. Isn’t that better than “high,” “medium” and “low”?

It’s All Relative
However creative that approach is, the expression-based defect ranking system does leave things to interpretation. One person’s “it’s a teeny-weeny typo” is someone else’s “clean out your desk and be out of here before the cops come,” especially if that typo was in your CEO’s name, or in one of the digits in your upcoming Securities and Exchange Commission filing.

Similarly, while your CFO might issue a scathing four-letter expletive in both cases, which of the following is worse: a bug that applies the wrong algorithms to stock-options pricing, or a bug that applies the wrong algorithms to a credit-scoring system? Selecting the “we’re totally screwed” button in the issue-defect system’s severity/priority may not accurately communicate the CFO’s displeasure.

The right solution, as I’m sure you will agree, is to hand these sorts of things to the U.S. government. No, not the actual grading of your software’s bugs, you silly person: that’s your job, and you can’t get out of it. No, we should take a leaf from how our benevolent authorities have responded to airport security. You don’t hear the public address system at the airport say, “We’re at terror level, we’re-all-going-to-die!” That would cause panic. Worse, it’s ambiguous, since it doesn’t communicate who is going to die, when this will take place, and if you have time to buy a $5 Bloody Mary from a flight attendant beforehand.

Instead, as I’m sure you know, the monotone announcement on the P.A. system says something like, “Attention: We are at Homeland Security Threat Condition Orange.” This is more practical, and more useful, because it’s from the government, and they know best.

That, my friends, is the model that software test/QA professionals should use when communicating with end users and other stakeholders about bugs, when assessing the bugs for ourselves, and when classifying said bugs in the defect database.

“Hey, Bob, looks like we’ve got a nice, juicy Yellow here,” you might hear shouted over a cubicle. “Are you sure it’s not Orange? We’re fixing only Oranges before the next beta,” you might shout back. And so on, and so forth.

The U.S. government’s colorful Homeland Security Advisory System (HSAS) was enacted in March 2002. Forget about those silly “low,” “medium” and “high” scales that you see so often in defect-management systems: The HSAS goes much farther, with five levels:
• Red = Severe: Severe Risk of Terrorist Attacks
• Orange = High: High Risk of Terrorist Attacks
• Yellow = Elevated: Significant Risk of Terrorist Attacks
• Blue = Guarded: General Risk of Terrorist Attacks
• Green = Low: Low Risk of Terrorist Attacks

Brilliant, brilliant, I can hear you saying. Go ahead, say it again: Brilliant. Thanks. You can see instantly why this is appropriate for software development and test/QA.

I would humbly propose the following scale for categorizing software threats. Actually, I’d like to propose two scales, which I call the Defect-Advisory Software System (D-ASS). The first D-ASS scale is the one that you tell your end users, managers and other stakeholders about, and which they use for reporting bugs to your test team:
• Red = Severe: This Must Be Fixed Immediately
• Orange = High: This Should Be Fixed Soon
• Yellow = Elevated: Fix This When You Can
• Blue = Guarded: Fix This Sometime, Maybe
• Green = Low: Just Thought You Should Know

The other D-ASS scale, of course, is more important, because it’s the one you use to categorize actionable issues in the defect database:
• Red = Severe: This Will Definitely Get You Fired
• Orange = High: This Will Probably Get You Fired
• Yellow = Elevated: This Might Get You Fired
• Blue = Guarded: This Probably Won’t Get You Fired
• Green = Low: This Is Just Stupid

Follow this system, my friends, and you’ll never get scolded for misspelling the CEO’s name or miscalculating stock option prices, ever, ever again. And you’ll surely find yourself using some form of the popular phrase “I fixed that bug just to cover D-ASS.”

Retired test engineer I.B. Phoolen believes that Qaulity is Job #1. Write him at ibphoolen@gmail.com.
HP software is turning I.T. on its head, by pairing sixteen years of Mercury’s QA experience with the full range of HP solutions and support. Now you can have the best of both worlds. Learn more at OptimizeTheOutcome.com
©2006 Hewlett-Packard Development Company, L.P.
There’s a new way to look at QA software.