A Publication
Best Practices: Testing With Eclipse
VOLUME 4 • ISSUE 8 • AUGUST 2007 • $8.95 • www.stpmag.com
A Tail of Two Test Teams
Dogged Java EE Unit Testing
Bark! Bark! Bark! Protect User Roles During Security Testing
Training Your Unit-Test Team Takes Time, Patience, Treats
Testing Secrets REVEALED! TOTAL IMMERSION!
October 2–4, 2007 • The Hyatt Regency Cambridge • Boston, MA
Choose From 70+ Classes • Pick From 8 In-Depth Tutorials • Hands-on Tool Demo Sessions • Network With Colleagues • Ice Cream Social • Reception in Exhibitor Demonstration Hall • Pose Your Questions to 25 Exhibitors • Mingle With More Than 40 Speakers
SUPERB SPEAKERS! Rex Black • Scott Barber • James Bach • Clyneice Chaney • Michael Bolton • Robert Galen • Jeff Feldstein • Linda Hayes • Robin Goldsmith • Mary Sweeney • Robert Sabourin • Robert Walsh and dozens more!
TERRIFIC TOPICS! Managing Test/QA Teams • Testing SOA and Web Services • C# and Java Testing • Agile Testing • Locating Performance Bottlenecks • Just-in-Time Testing • Effective Metrics • Requirements Gathering • Improving Java Performance • Risk-Based Testing Strategies • Security Testing
“If you want to learn about the latest test and performance techniques from the industry experts, you want to attend STPCon!” —Michael Marquiz, Solutions Architect, Cisco Systems
Register by Sept. 14 and Get the Early Bird Discount! SAVE $200
Register today at www.stpcon.com
The days of
‘Play with it until it breaks’ are over!
Learn how to thoroughly test your applications in less time. Read our newest white paper: “All-pairs Testing and TestTrack TCM.” Download it today at www.seapine.com/allpairs4
Seapine ®
TestTrack TCM
Software for test case planning, execution, and tracking
You can’t ship with confidence if you don’t have the tools in place to document, repeat, and quantify your testing effort. TestTrack TCM can help you thoroughly test your applications in less time.
Seapine ALM Solutions:
TestTrack Pro: Issue & Defect Management
TestTrack TCM: Test Case Planning & Tracking
Surround SCM: Configuration Management
QA Wizard Pro: Automated Functional Testing
In TestTrack TCM you have the tool you need to write and manage thousands of test cases, select sets of tests to run against builds, and process the pass/fail results using your development workflow. With TestTrack TCM driving your QA process, you’ll know what has been tested, what hasn't, and how much effort remains to ship a quality product. Deliver with the confidence only achieved with a well-planned testing effort.
• Ensure all steps are executed, and in the same order, for more consistent testing.
• Know instantly which test cases have been executed, what your coverage is, and how much testing remains.
• Track test case execution times to calculate how much time is required to test your applications.
• Streamline the QA/Fix/Re-test cycle by pushing test failures immediately into the defect management workflow.
• Cost-effectively implement an auditable quality assurance program.
Download your fully functional evaluation software now at www.seapine.com/stptcm or call 1-888-683-6456. ©2007 Seapine Software, Inc. Seapine, the Seapine logo and TestTrack TCM are trademarks of Seapine Software, Inc. All Rights Reserved.
VOLUME 4 • ISSUE 8 • AUGUST 2007
Contents
A Publication
12 COVER STORY
18
A Tale of Two Test Teams
Both teams work with similar tools, projects and automation levels—so what’s the reason for their different success rates? It’s simple: a feedback system they can count on. By Jeff Nielsen
Sit! How To Train Your Team For Unit Testing When it comes to adoption of new testing methods, it takes patience, planning and pacing. Help your team learn how to sit up and surf the change curve from chaos to success. By Jeffrey Fredrick
Departments
7 • Editorial “Field of Dreams” got it right—about both baseball and software.
26
Expand Your Testing Palette
With proper setup and deployment of JUnit tests, many Java EE frameworks can be well tested. But when new test harnesses are involved, you need more than JUnit to identify the means and breadth of testing. Expand your test palette with this four-step process to make your unit testing a more vibrant experience. By Matt Love
8 • Contributors Get to know this month’s experts and the best practices they preach.
9 • Feedback Now it’s your chance to tell us where to go.
The Security Zone Playing the Part Of Protector
33
Beef up security! Organize, define and automate user role-based security testing for your enterprise applications, and your company won’t have to call in the gladiators. By Linda Hayes
10 • Out of the Box New products for developers and testers.
36 • Best Practices Eclipse turns 7—happy birthday to us! How a community coalesces. By Geoff Koch
38 • Future Test Correlate app components to the business processes that use them. By Jason Donahue
Trying to be agile when your Java code is fragile? Feeling the pressure to release software faster? Are you bringing new features to market as quickly as your business demands? If enhancing or extending your Java application feels risky – if you need to be agile, but instead find yourself hanging on by a thread – AgitarOne can help.
With AgitarOne’s powerful, automated unit testing features, you can create a safety net of tests that detect changes, so you know instantly when a new feature breaks something. Now you can enhance and extend your Java applications – fast, and without fear. And with AgitarOne’s interactive capabilities for exploratory testing, it’s easy to test your code as you write it. Whether you’re building a new application, adding new features to existing software, or simply chasing a bug, AgitarOne can help you stay a jump ahead.
©2007 Agitar Software, Inc.
VOLUME 4 • ISSUE 8 • AUGUST 2007 Editor Edward J. Correia +1-631-421-4158 x100 ecorreia@bzmedia.com
EDITORIAL Editorial Director Alan Zeichick +1-650-359-4763 alan@bzmedia.com
Copy Editor Laurie O’Connell loconnell@bzmedia.com
Contributing Editor Geoff Koch koch.geoff@gmail.com
ART & PRODUCTION Art Director LuAnn T. Palazzo lpalazzo@bzmedia.com
Art /Production Assistant Erin Broadhurst ebroadhurst@bzmedia.com
SALES & MARKETING
Publisher: Ted Bahr +1-631-421-4158 x101 ted@bzmedia.com
Associate Publisher: David Karp +1-631-421-4158 x102 dkarp@bzmedia.com
List Services: Agnes Vanek +1-631-421-4158 x111 avanek@bzmedia.com
Advertising Traffic: Phyllis Oakes +1-631-421-4158 x115 poakes@bzmedia.com
Reprints: Lisa Abelson +1-516-379-7097 labelson@bzmedia.com
Director of Marketing: Marilyn Daly +1-631-421-4158 x118 mdaly@bzmedia.com
Accounting: Viena Isaray +1-631-421-4158 x110 visaray@bzmedia.com
READER SERVICE
Director of Circulation: Agnes Vanek +1-631-421-4158 x111 avanek@bzmedia.com
Customer Service/Subscriptions: +1-847-763-9692 stpmag@halldata.com
Cover Illustration by Dianna Toney
President Ted Bahr Executive Vice President Alan Zeichick
BZ Media LLC 7 High Street, Suite 407 Huntington, NY 11743 +1-631-421-4158 fax +1-631-421-4130 www.bzmedia.com info@bzmedia.com
Software Test & Performance (ISSN- #1548-3460) is published monthly by BZ Media LLC, 7 High St. Suite 407, Huntington, NY, 11743. Periodicals postage paid at Huntington, NY and additional offices. Software Test & Performance is a registered trademark of BZ Media LLC. All contents copyrighted 2007 BZ Media LLC. All rights reserved. The price of a one year subscription is US $49.95, $69.95 in Canada, $99.95 elsewhere. POSTMASTER: Send changes of address to Software Test & Performance, PO Box 2169, Skokie, IL 60076. Software Test & Performance Subscribers Services may be reached at stpmag@halldata.com or by calling 1-847-763-9692.
AUGUST 2007
Ed Notes
If You Build It, They Will Come
Edward J. Correia

As I write this, I’m thinking of last night’s midsummer classic—the 2007 All-Star Game—in which the National League was an extra base-hit away from a dramatic ninth-inning, come-from-behind win. But alas, the American League will again have home-field advantage in the World Series.
I think that giving the All-Star Game some meaning beyond bragging rights is a good thing. Plenty of sports-savvy folks disagree, saying that many of the All-Star players come from teams with no chance of making it to the Fall Classic, and therefore have no impetus to make it exciting.
To me, that’s like saying that because I have no personal data on my company’s Web site, I have no interest in keeping it secure. Nonsense.
That argument assumes that players from one league or the other have no pride in—or loyalty to—their own league, and that just because their own team didn’t make it to the World Series, they won’t then root for another team from their league. For example, if my favorite team—the New York Mets—doesn’t make it, I don’t automatically root for the Yankees; I root for the National League team opposing them (unless it’s the Atlanta Braves).
And I believe that in addition to the drive simply to win the game, players have loyalties similar to those we all should have about our company’s applications, Web sites, corporate data, fellow employees and other resources.
The phrase “If you build it, they will come” from the fine 1989 film “Field of Dreams” is often quoted. But if “it” refers to your Web site, “they” surely means hackers. Is your site up to the task?
And for enterprise applications—particularly those with sensitive data such as ERP, CRM and payroll—comers could include the malicious as well as the innocent. As a tester, it’s up to you to determine the roles the applications are intended to accommodate, and to make sure they allow and disallow access as appropriate.
In this month’s Security Zone special section, software testing veteran Linda Hayes addresses this critical role-based security issue. Cross-functional capabilities and tight integration require enterprise applications to provide security to specific components and data elements. This time-consuming role-based testing lends itself well to automation. This article provides a guideline to organizing, defining and automating your role-based security testing.

Exercise Your Unit Testing
The remaining three articles this month are all about unit testing. The first is the second and concluding installment on continuous integration from Jeffrey Fredrick, lead committer of the CruiseControl project. In it, Fredrick presents a model of change that explains why adopting unit testing is problematic, shows how understanding the model can suggest solutions, and offers steps that can be implemented to help a team successfully adopt unit testing. In short, how to train your old team some new tricks.
Also notable is “A Tale of Two Test Teams” from agile development guru Jeff Nielsen. On the surface, the two teams appear alike. They have similar projects, similar tools and almost identical levels of automation. But spend a day with each team and you’ll begin to notice some big differences.
You’re Reading A Winner!
Software Test & Performance magazine is a 3-time winner in the 2007 American Inhouse Design Awards, from the editors of Graphic Design USA.
Its three winning designs were selected from more than 4,000 submissions, the most entries ever, according to organizers.
Contributors In the conclusion of his two-part article on continuous integration, JEFFREY FREDRICK, the top committer for CruiseControl, explores the training issues that usually accompany the move to unit testing and how to address them. See page 12. Part one of this article, which explains how to prepare your infrastructure for continuous integration, can be found in our May issue. CruiseControl is an open-source Java framework that facilitates a continuous build process with capabilities such as source control and e-mail notification plug-ins. Jeffrey is also head of product management for Agitar Software. In “A Tale of Two Test Teams,” JEFF NIELSEN takes us on a journey into the daily lives of a pair of teams building and testing a Java-based Web application. Though mostly similar in terms of development, the teams part company in the way they conduct their testing. Jeff explains that for an automated build system to be effective, it must have at least these three characteristics: Test and develop as a single activity; provide timely, relevant and consistent feedback; and involve everyone on the product team. The article begins on page 18. Jeff got his first software testing job in 1987, and is now agile officer at Command Information, where he coaches agile development teams of Fortune 1000 companies. MATT LOVE is a software development manager at test tools maker Parasoft. Beginning on page 26, Matt explains why JUnit is not the end-all for unit testing tools. This article will teach you how to leverage the benefits of several unit-testing techniques that go beyond JUnit as they apply to Java EE applications. Matt has been a Java developer since 1997, and has been involved in the development of Parasoft’s JTest unit-testing tool since 2001. He holds a bachelor’s degree in computer engineering from the University of California, San Diego.
With more than 20 years of experience in software test automation, LINDA HAYES has founded three software companies, including AutoTester, which produced the first PC-based test automation tool. She is currently the CTO of test automation tools maker WorkSoft. Cross-functional capabilities and tight integration require enterprise applications to provide security to specific components and data elements. In our Security Zone section beginning on page 33, Linda plays the part of protector as she delves into role-based security practices for testing enterprise applications. Linda is a frequent industry speaker and author on software quality.
TO CONTACT AN AUTHOR, please send e-mail to feedback@bzmedia.com.
Feedback AUTOMATED AND AVAILABLE At Scrumco, the second team in Edward J. Correia’s Test & QA Report newsletter article “Testing Without Requirements—Impossible?” (June 26, 2007), the developers should be using test-driven development. All members of the team would know what needed to be tested because that is the first document created for the project. This testing would be automated and available forever. Ed DiGiulio Fort Wayne, Ind.
GETTING THE RIGHT INFO
Regarding Edward J. Correia’s Test & QA Report article “Testing Without Requirements—Impossible?” the challenge of getting the right information is a well-known problem in many companies. Even if you get the required documentation, what you really need is missing. I have seen a broad variety of documents with different kinds of quality (from excellence to unusable) but the experience I liked the most was a project that combined two very different approaches in two different phases of the project.
The project started as an XP approach where the system was built from scratch; as the tester, I worked with developers and the project leader. During that time I learnt the application inside out, even though there was not much documentation. After the program was shipped, we shifted to phase 2, from XP to a more classical approach where the developer and the BA wrote a rough specification. After the initial version was built, I got release notes with background information written by the BA and developer. Even at a late stage in all the following phases, reading those release notes was enough for me to understand exactly which extensions were introduced and where I had to look for defects.
In another project that had poor documentation, target users tested the system every six weeks for a whole week, while we delivered fixes every day and agreed on new features for the next iteration. The product quality was amazing, even though RUP and XP were foreign words at that time.
In cases where no or poor documentation is around, I ask developers to present the features they have introduced rather than forcing them to write poor documentation that contains only half of the required information. Rather than focusing too much on requirements documentation, it is more important to have the team close together and make sure you get the information you need to become a domain expert of the system you are testing. I do also agree that such a procedure may probably not work out in the aviation or the medical industries.
Torsten J. Zelger
Zurich, Switzerland

CLASS-A JOB
Edward J. Correia’s “Testing Without Requirements—Impossible?” was a nice article, but I would add that even if you had documentation, the three steps outlined are always part of the key in doing a class-A job in testing.
Hudson Robinson
New York, N.Y.

ROOM FOR IMPROVEMENT
Regarding Edward J. Correia’s June 19, 2007, Test & QA Report newsletter “Will IBM Tend to Telelogic Test Tools?” I read this news with great interest. It could lead over time to some improvement in this sector.
Earlier Telelogic products (unless they also purchased it from another vendor?) like Continuus have taken the term continuous in CM far before the mainstream caught its attention, but did a rather crappy job in Eclipse/WSAD integration—so bad that this tool completely vanished, and only its connector CM/Synergy has remained as a brand and has later been simplified into Synergy. There’s been a bit of progress toward more Eclipse Team compatibility, but [from] what I last saw of them, they still admitted not being team compliant and doing “their own thing” on top of Eclipse only.
ClearCase may not be the best tool either, and many other Rational products have been far worse than what they promoted (for example, Rational Rose) while being market leaders at that time. Not becoming much leaner but clearly improving the integration into Eclipse after a full takeover, RAD is now one of IBM’s most important products. It is likely to take a long time to merge it into something like CC/Synergy (for ClearCase)—they might drop the Synergy or, as it happened to WSAD, the ClearCase—or maybe call it ClearSynergy.
While other vendors diversify their IDE products, but also finally do a decent job like Borland (JBuilder 2007 or Together 2006), IBM seems to try adding more assets under their roof, not just the Eclipse Foundation. Borland and CodeGear might be some of the reasons behind that attempt, but compared to some of their tools too (the ones not “rebranded”), neither ClearCase nor Synergy has much to hide.
Werner Keil
New Malden, Surrey U.K.

FEEDBACK: Letters should include the writer’s name, city, state, company affiliation, e-mail address and daytime phone number. Send your thoughts to feedback@bzmedia.com. Letters become the property of BZ Media and may be edited for space and style.
Out of the Box
AmberPoint Gives SOA A Good Throttling AmberPoint, maker of test tools for SOA runtime governance, this month is set to begin shipping version 6 of its SOA Management System and SOA Validation System. The pair of suites delivers improved visibility and control of service-based applications as well as trafficthrottling policies and broader support for BEA, IBM and JMS-based systems. Greater visibility is presented through SOA Explorer, a redesigned Flex- and AJAX-based user interface. Color-coded representations of the network display operational data in real time that can be monitored and drilled down for closer inspection and repair. Views are organized into tabs. Services can be searched, sorted, filtered based on failure, performance and security, and compared; data can be exported. Testers gain new control over traffic and data flow in version 6 with policies that, according to company documents, “monitor system load and can automatically regulate requests” based on such parameters as response time, request rate and fault rate. This can be used to help prevent runaway demand and can give teams control over traffic
SOA Explorer is a new Flex- and AJAX-based interface for browsing and drilling into services.
from specific applications, users or services. “For example, at times of peak loads, update requests can continue unthrottled while search requests are kept in check.” AmberPoint’s SOA Validation System can now operate on end-to-end applications spanning multiple services, which the company says improves its ability to detect “potential runtime anomalies that might result from changes to the production environment.” Also new is support for command-
line calls from shell scripts and scripting tools such as Ant and Perl, as well as homegrown solutions. Automation can be further facilitated by integrating with third-party system-management tools and IDEs that use scripting. AmberPoint was set to begin shipping version 6 of SOA Management System and SOA Validation System about August 11, adding support for BEA AquaLogic Service Bus, IBM WebSphere ESB and JMS-based systems such as IBM MQ Series and TIBCO EMS.
froglogic Bestows Squish With Extension API Beginning with version 3.2 released in May, the Squish testing framework from froglogic can be extended and customized, including the addition of a set of extension APIs. The company in June posted version 3.2.1, a maintenance release with bug fixes. Squish is a cross-platform UI test-scripting tool for applications made using AJAX, HTML, Four J’s, Java AWT/Swing, Qt, SWT/Eclipse RCP, Tk and XView. The company claims that test scripts made with Squish “keep working when
the application under test evolves,” according to company documents. It supports JavaScript, Perl, Python and Tcl, “extended by test-specific functions, open interfaces, add-ons, integrations into test management systems,” an IDE for creating and debugging tests, and command line tools for automating it with test runs. The company said that the enhanced extensibility now provides complete control over object naming and identification, as well as more complex name-matching algorithms, and enables users
to add support for custom controls such as “complex AJAX Web widgets.” The release also includes several enhancements specific to particular UI libraries. For example, Squish for Qt now supports Qt 4.3, Squish for Web (HTML/AJAX) supports nearly a dozen new AJAX toolkits including Backbase, DoJo, GWT, Infragistics, IT Mill, qooxdoo, Smart Client and Telerik, Squish for Tk supports PyTk, and the Java edition provides improved object-name identification.
Nothing Virtual About Better VMWare Lab Manager Automated setup, capture, storage and sharing of multi-machine software configurations are among the new features of VMWare Lab Manager 2.5, which began shipping in early July. Pricing starts at US$15,000; the tool works with VMWare Infrastructure 3, which lists for $35,000. Lab Manager works by pooling server, network and other resources, and allocating them as needed. Administration is done through a browser-based portal, which provides access to a shared library of machine configurations saved using delta-tree imaging to maximize storage space. Major enhancements to Lab Manager 2.5 are the addition of iSCSI and NFS storage options for VM libraries (adding to existing support for fibre channel SANs), and the ability to set policies that undeploy and clean up unused VMs and automatic calculation of freeable disk space. Virtual machine configurations can now be created, deployed and managed from 64-bit and virtual SMP guest operating systems. The tool now supports VMs running Solaris 10 on x86, and includes “experimental” support for Vista.
Tout Virtualization, Charge Accordingly One of the problems of virtualization technologies, which most analysts agree will see a huge wave of adoption in coming years, is their lack of cross-product management capability. Addressing that very problem is ToutVirtual, which in late May began charging for its VirtualIQ management product released a year ago as freeware. The new VirtualIQ Pro, an expanded version of the free VirtualIQ 525, starts at US$599 per year for as many as 10 processors (sockets) or 50 virtual machines, doubling the specs of its predecessor. It also adds Microsoft and Novell to its list of supported virtual server makers that already included VMWare and Xen. AUGUST 2007
The solution is available as an appliance or as software for Linux or Windows. Both platforms also are supported for management. Also new in Pro is real-time monitoring and reporting, policy-based actions and alarms, faster root-cause analysis, visibility and performance management of physical and virtual hosts, agentless operation, backup of virtual machines and historical reporting.
Zion Helps Colonize Buddy Lists If bots are your thing, Zion Software has updated its free JBuddy IM Toolkits to version 6, adding support for Microsoft’s Live Communications Server (LCS), an IM Bot Framework that simplifies the creation of automated agents activated through public or private instant messaging networks. JBuddy is a set of instant messaging and presence tools and APIs for COM, Java, .NET and ColdFusion MX 7 that permits developers and testers to build automated apps for data retrieval, alerts, launching test scripts and myriad other uses over IM networks. The tools support AIM/ICQ, Google Talk, Jabber (XMPP), Microsoft LCN, MSN Messenger, Lotus Sametime and Yahoo Instant Messenger, as well as the company’s own JBuddy Message Server. The IM Bot Framework allows “even nonprogrammers to create sophisticated IM Bots using XML at no cost,” according to a company document.
Longitude Version 5 Shows VM Heroics Performance and monitoring solutions provider Heroix in June began shipping Longitude 5, adding the ability to generate synthetic Web transactions and monitor virtual machines, the company says. Longitude monitors a company’s infrastructure and provides performance metrics on a variety of components, including applications, servers, networks and service-level agreements (SLAs). According to a news release, performance reports from Longitude 5 can now automatically link performance metrics
with the underlying events. This sort of self-annotation, Heroix says, allows drilldowns from within the reports. “Users can now add comments to reports and SLAs, such as background information about events or actions taken in cause, impact and corrective measures.” Synthetic Web transactions can be used to “measure how users actually interact with Web-based applications and incorporate those metrics into SLA monitoring,” the company says. This can give QA testers a view into user experience as it relates to particular infrastructure components. Also new is monitoring support for IBM Director, WebSphere 6, SQL Server 2005, network shares, log files and for VMWare virtual machine images and their impact on hardware. Version 5 can now automatically discover network applications and systems to be monitored, displays all resources graphically, and can sort them by geography, network topology or logical hierarchy. Longitude 5 is available now for Linux, Unix and Windows; pricing starts at US$299 per monitored system.
Solstice to Shorten SOA-Testing Days For companies building Java EE-based SOA systems, Solstice Software has enhanced its Integra Suite with end-to-end testing capabilities for complex architectures deployed using Web services, EJBs and JMS. Solstice claims to differentiate the release—Integra Suite 6—from competitors with the ability to test and validate endto-end services “including middleware connections across multiple applications and transports, and simulating unavailable application components,” according to a company document. The company claims to support any Java EE app server, and names JBoss, NetWeaver, Oracle iAS, WebLogic and WebSphere specifically. Also new in version 6 is the ability to automatically generate EJB test cases from existing interfaces and create SOAP messages that conform to specific security policies. Send product announcements to stpnews@bzmedia.com www.stpmag.com •
Sit! How To Train Your Team For Unit Testing By Jeffrey Fredrick
Teach an Old Test Team Some Great New Tricks

Jeffrey Fredrick is a top committer for the CruiseControl project and head of product management at Agitar Software.

At first glance it’s hard to understand why more teams aren’t engaged in unit testing. Businesses today uniformly push to respond to competition more quickly and deploy software faster and with fewer defects—in other words, to be more agile. It’s widely acknowledged that code bases that include unit tests are faster and easier to change, and have fewer bugs than those without. So the appeal of unit testing should seem not only logical, but irresistible. I’ve seen hundreds of teams attempt to adopt unit testing, but all too often fail. Why? From my experience, it’s mainly because the people leading the change think they need to solve only a technical or perhaps a training problem. Often they fail to address the people problems associated with change itself. In this article I’ll offer a model of change that explains why adopting unit testing is so problematic, show how the model can suggest solutions, and offer steps that can help a team find success in unit testing.

Hope vs. Reality
Unit testing is like any other practice—you can’t reasonably expect to dive in and be a master from the beginning. But most teams don’t consider how they’ll make the transition. They just blunder into the
process: buy a few books, read a few articles, tell everyone to start testing and hope for the best. Implicitly they put their faith in a linear model of change that says everyone will improve over time—and, while the rate of improvement might be slower than they like, each day should be at least a little better than the one before it. This model of change appears simple and logical, and is therefore appealing. It’s also tragically wrong. The reality is that any fundamental change of practice requires a temporary drop in performance. This was true for Tiger Woods adopting his new swing, and it will be true for the programmer picking up a new development habit like unit testing. This drop in performance has been documented, and is illustrated by the Satir model of change (see sidebar, page 15). To get to the eventual benefit, teams need to work through this uncomfortable transition period when they’re actually slower than before. Sticking with the new habit through this period is difficult under any circumstances. When you’re constantly under schedule pressure, under the gun to get more done sooner, it’s even more tempting to fall back to familiar ways to get the job done. And this is the point at which both individuals and teams abandon their attempt at unit testing and return to their old methods. Thus, ironically, the very drive toward faster delivery that makes unit testing appealing actually retards many teams from successful adoption.
Illustrations by Dianna Toney
FIG. 1: MODEL OF ITERATIVE CHANGE (figure: software development capability over time; piecemeal change climbs from the current capability toward the desired level through a series of small dips)
• Smaller steps can be faster
• Reduced time to payback investment causes virtuous cycle
The Challenge of Change Fortunately you don’t need to be a passive victim of this productivity dip. A leader who’s aware of the challenge of change can choose strategies that make success far more likely. Four such strategies can be used to help ensure a successful transition. First and foremost, you’ll want to modify the shape of Satir’s change curve. Rather than paying the price of change all at once, it’s better to pay a little bit at a time, to successively achieve a series of new plateaus where the team is experiencing smaller improvements faster (see Figure 1). With a series of small improvements, not only is each one more likely to be successful, later changes are compounded by prior successes. This also helps remove the sting of the temporary drops in performance and makes them shorter. Second, designate a point person to prepare the infrastructure and work through obstacles. This will help make sure that the process changes made by the rest of the team yield their maximum return and minimize distractions caused by technical glitches. This allows the team to benefit from this preparation work without feeling the investment of time. Third, set up a system of fast feedback and reward accomplishments. In animal studies of change, it has been revealed that fast feedback with small rewards is a more potent tool for change than a large reward offered later. With unit tests, there are two accomplishments that you should reward with fast feedback. First is progress toward developing the test suite. Every test that is created is an accomplishment that AUGUST 2007
you’d like your system to reflect. Conversely, you should also reward test failures. It may seem odd, but the point of creating unit tests is to help you catch mistakes early. And it is through failure that you learn about mistakes. The faster this feedback cycle occurs, the more your unit testing efforts will be reinforced.
Finally, in charting your path to unit testing, you’ll want to leverage the power of social proof. In times of uncertainty, such as when a team is attempting a new practice, people are strongly guided by the behavior of others when deciding how to behave themselves. If you see the people around you abandoning the change, you’ll be inclined to follow suit. But if you see that others on the team are working through the challenges to create unit tests, your own resolve to do so will be reinforced.
Putting all of these ideas together, an implementation plan for adoption of unit testing would follow these steps:
• Create a machine-independent build that can be run by all developers
• Begin running the build under a continuous integration tool to get the team used to responding to the build pass/fail status
• Update the build so that it will run any tests that are created and report progress and status
• Create the initial tests to provide an example for the rest of the team
Each of these steps will bring its own reward, and each could be implemented by small subsets of the team. When they’re all in place, the ground is prepared for the rest of the developers to get the most from tests with the least effort.
developers is Ant (ant.apache.org), which is what I used for my example. Ant uses XML build files that define a set of steps or targets that can be called. Each step is composed of a series of tasks to be executed. The tool supports a number of build-related tasks and is supplemented by an ecosystem of third-party scripts, such as those to link with various source control systems. For our example, a very simple Ant build script, typically named build.xml, would be something like this:

<project name="my_project" default="compile">
FROM CHAOS TO SUCCESS
Virginia Satir created her model of change in her role as a family therapist, and it was later adapted into the vocabulary of software organizations by Gerald Weinberg. As normally described, the model predicts that a disruptive element will create a time of chaos with lowered performance. This period will be followed by a time of integration until a new plateau of performance is achieved. At the level of individual practice, this drop in performance reflects the effort in taking a conscious action and converting it into an unconscious habit. Here’s how: Before trying unit testing, create a set of daily habits at which you’re proficient. As you begin to apply the new technique, slow down as you consciously think your way through what needs to be done. As you learn unit testing, you’ll require less and less conscious thought to perform it, until eventually you’ll be fully proficient and performing at a high level.
The Satir Model of Change (figure: software development capability over time; a typical change dips below the current capability before climbing to the desired level)
Step 1: Machine-Independent Build The first step to establishing a unit testing program is establishing a common execution environment that can be executed from anywhere (also see “Pickling Your Builds,” July 2007). It doesn’t matter which technology you use for your build. The popular choices include Ant or Maven tools, Perl or Python scripting languages, and domain-specific languages such as Rake and Make. The critical requirement is that anyone on the team can run the build from any machine and get the same result. A common build tool for Java AUGUST 2007
mand line, go to the directory where the build.xml file existed and type ant. Assuming that Ant is installed and on the execution path, Ant would then: • Find the default target “compile” and see that it depended on the target “clean” • Find clean and execute it, which would delete and re-create the output directory • Execute compile, which would compile any Java source files found in the directory “src” and put the resulting class files in “output/classes”
• Change generally results in an initial productivity dip • Too long before payback can result in abandoning change
<target name="compile" depends="clean">
  <javac srcdir="src" destdir="output/classes"/>
</target>
<target name="clean">
  <delete dir="output"/>
  <mkdir dir="output/classes"/>
</target>
</project>
If your directories are set up correctly and your source files all compile, you’ll see Ant execute the steps described and then finish with the words BUILD SUCCESSFUL. If there’s a problem, such as a compile failure, you’ll get an error message similar to this:
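What follows is an illustrative sketch of typical Ant failure output; the source file, line number and counts are invented for the example, and the exact wording varies by Ant version:

Buildfile: build.xml

clean:
   [delete] Deleting directory output
    [mkdir] Created dir: output/classes

compile:
    [javac] Compiling 12 source files to output/classes
    [javac] src/Widget.java:42: ';' expected
    [javac] 1 error

BUILD FAILED
build.xml:4: Compile failed; see the compiler error output for details.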
To invoke this script from the com-
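For reference, a minimal CruiseControl config.xml along the lines described in the next paragraph might look like the sketch below. The schedule interval, working-copy paths and Subversion details are illustrative assumptions rather than the article’s original listing:

<cruisecontrol>
  <project name="my_project">
    <!-- update the working copy from Subversion before each build -->
    <bootstrappers>
      <svnbootstrapper localWorkingCopy="projects/my_project"/>
    </bootstrappers>
    <!-- watch the repository for committed changes -->
    <modificationset quietperiod="30">
      <svn localWorkingCopy="projects/my_project"/>
    </modificationset>
    <!-- when changes appear, run the Ant build in the project directory -->
    <schedule interval="60">
      <ant buildfile="projects/my_project/build.xml"/>
    </schedule>
  </project>
</cruisecontrol>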
With this configuration, you’re telling CruiseControl that you have a project named my_project checked out from Subversion (svn), that the files should be updated automatically by the bootstrapper, and that when changes occur, you should build by running Ant in the my_project directory. When the build runs, you can see the status of the project by looking at the CruiseControl Web interface (see Figure 2). However, you wouldn’t want to rely solely on manual checks of this page by the team. Fortunately, you can choose from a wide variety of options to keep everyone informed. Builds can be monitored using a number of third-party tools, including plug-ins for Firefox and Thunderbird, widgets for Yahoo and dashboards for Mac OS X. Also, a number of publishers plug into CruiseControl to keep the team upto-date. Built-in publishers can send email, use instant messaging or operate electronic devices using the X10 home automation device interface. The same mechanisms used to keep on top of compile failures also can be used for test failures.
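One hedged example of such a publisher is e-mail notification, configured roughly like this inside the project element; the host names and addresses here are placeholders:

<publishers>
  <!-- mail the build result to the team; send failures to the leads as well -->
  <email mailhost="smtp.example.com"
         returnaddress="builds@example.com"
         buildresultsurl="http://buildserver:8080/buildresults/my_project">
    <always address="team@example.com"/>
    <failure address="dev-leads@example.com"/>
  </email>
</publishers>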
Here you’re telling Ant that your test classes are defined in all the Java source files under the test directory that end with the word “Test.” With this information, Ant will find the names of all the test classes, run them and write the results to XML files in the test_results directory. If any of the tests fail, it will set a property named “junit.failed.” After the JUnit tests are run, you then tell Ant to check if that property has been set—and if so, the build should fail. With this script in place, adding new tests is easy for developers. All
FIG. 2: THE CRUISECONTROL WEB INTERFACE
Step 3: Update the Build for Tests Next, prepare the build script so that any tests committed will be executed and any failures will be reported auto-
Step 3: Update the Build for Tests
Next, prepare the build script so that any tests committed will be executed and any failures will be reported automatically. It might seem strange to be putting this infrastructure in place before there are tests, but this preparation is intended to work out most of the kinks before the team begins testing. Modify the previous Ant script so that any tests created are compiled and executed, and that if a test fails, the build fails. First, add a new target to compile the tests:

<target name="compile.test" depends="compile">
  <mkdir dir="output/test_classes"/>
  <javac srcdir="test" destdir="output/test_classes">
    <classpath>
      <pathelement path="output/classes"/>
      <pathelement path="lib/junit-3.8.2.jar"/>
    </classpath>
  </javac>
</target>
This tells Ant that compiling the tests depends on first compiling the application classes. Once the classes are up-to-date, your tests are compiled with the application classes and JUnit as dependencies on the classpath. After compiling, you then run your tests by adding a new target:

<target name="test" depends="compile.test">
  <mkdir dir="output/test_results"/>
  <junit printsummary="on" failureproperty="junit.failed">
    <classpath>
      <pathelement path="output/classes"/>
      <pathelement path="output/test_classes"/>
    </classpath>
    <formatter type="xml"/>
    <batchtest todir="output/test_results">
      <fileset dir="test" includes="**/*Test.java"/>
    </batchtest>
  </junit>
  <fail if="junit.failed" message="one or more junit tests failed"/>
</target>
Here you’re telling Ant that your test classes are defined in all the Java source files under the test directory that end with the word “Test.” With this information, Ant will find the names of all the test classes, run them and write the results to XML files in the AUGUST 2007
test_results directory. If any of the tests fail, it will set a property named “junit.failed.” After the JUnit tests are run, you then tell Ant to check if that property has been set—and if so, the build should fail. With this script in place, adding new tests is easy for developers. All they need to do is create the test file with a name that matches your naming pattern and put it under the test directory. Your build will automatically find and execute it. If anyone checks in a failing test or
• The pioneer blazes the trail for the rest of the team, ironing out any kinks in the armor, so the developers who follow won’t have to.
• commits a source change that causes a test to fail, CruiseControl will mark the build as failing. It also can let you know which test was to blame. This is done if you specify where to find the XML files created by JUnit; that they can be merged into the build log file. You can do that by adding the following project information in the config.xml: <log> <merge dir=”projects/my_project/output/test_results” /> </log>
With
that
information,
Cruise
Tale of Two
A
By Jeff Nielsen
utomated testing, automated builds and continuous integration are increasingly accepted as software industry best practices.
A
Numerous tools—both open source and commercial—exist to help companies start automating tests of all kinds. Other tools make it possible to compile, package and deploy code, launch test suites automatically, and build multiple versions of a system, all with little or no human intervention. While these practices and tools can be powerful, many companies and teams struggle to implement an effective system of automated builds and tests. They discover that simply having the pieces in place isn’t enough to get real value from their automation efforts—in particular from automated tests. In some of the worst automation attempts I’ve seen, people create more problems than they solve. The solution here is usually to examine and improve the feedback loops. To realize value from automated tests and builds, all pieces have to be integrated into a system of effective feedback. Drawn from my consulting experience of the last several years, the following “tale of two teams” illustrates how two similar groups of developers and testers working on similar projects can have very different results. See which team is most like yours.
Meet the Players Team A is building a Java-based Web application. The team consists of five pro-
grammers, a full-time tester from the QA department, a business analyst and a project manager. They sit together during the workday in a large “war room” area. The developers on Team A have written a fairly comprehensive set of unit and integration tests, using the JUnit framework. They’ve also developed a build script that automates such tasks as compiling, checking code into and out of the repository, building the EAR file, running the tests, and stopping and starting the application server. Team A’s tester has put considerable effort into automating functional tests— tests that exercise the system through the user interface—using a popular opensource tool. This set of tests is helpful in keeping up with the growing task of regression-testing the system. The team has a server in one corner of their area running an open-source continuous integration system that monitors the source code repository and automatically kicks off a build each time code changes are committed. Team B is also building a Web application in Java with roughly the same number of people. They too sit together, with all of their cubicles in a common area. Like Team A, they have a suite of JUnit tests and a tester who is proficient at functional test automation. They are proud of their sophisticated build script (created by a consultant using the latest
Getting the Most Value From 18
• Software Test & Performance
AUGUST 2007
Test Teams Unit-Testing Behaviors The first thing we observe about Team A is that almost all of the developers use the JUnit tests as part of their minute-byminute work. They run the tests as they write their code and add new tests as they go. They currently report unit test coverage for about 90 percent of the application. While there are occasional complaints about a test that’s difficult to understand or one that doesn’t seem to be doing much, the developers by and large express confidence in their tests, which they agree have saved them from making mistakes on numerous occasions. On Team B, the developers have more of a love-hate relationship with their JUnit tests. They believe in the value of unit tests, but they don’t appear to run them much while they work. (Though they wouldn’t admit it, there are two developers who sometimes delete the JUnit tests in their local development environment so that they don’t “get in the way.”) Jeff Nielsen is chief scientist at Digital Focus, where he leads and coaches agile development teams for Fortune 1000 clients.
We notice that the tests are mostly run by Team B from the command line when developers reach a natural stopping point, and we hear a fair amount of grumbling about the additional effort needed to maintain the test code. The amount of coverage from the JUnit tests is a constant source of friction on Team B. The one senior developer who believes strongly in unit testing does his best to add as many tests as possible, but there are big sections of the code with little or no coverage. The others would like to write more tests, but explain that they rarely find the opportunity to do so. The unspoken question on everyone’s minds is whether the time spent creating tests is always worth it.
Integration Behaviors Back on Team A, we see the developers working in small chunks, trying to check in new or changed code every couple of hours. Even though all five are working with the same set of source files, they find that they’re usually able to avoid stepping on each other. No one seems to be afraid to update their local environment frequently with the latest changes from the source code repository, because this rarely causes problems. When a test does fail unexpectedly on a developer’s machine, it’s attributed to something the developer didn’t understand about someone else’s code. The problem is usually resolved within a few minutes. Over on Team B, things don’t run as smoothly. Although most of the devel-
Illustration by Perttu Sironen
tools) and their impressive-looking continuous integration software. On the surface, these two teams appear very much alike. They have similar projects, similar tools and almost identical levels of automation. But spend a day with each team and you’ll begin to notice some big differences.
Feedback Changes Behavior
opers manage to check in code at least once a day, they find the update/checkin process somewhat painful. They regularly experience merge conflicts, where one developer’s changes overwrite or break another developer’s work. This makes them cautious about when and how often they update. Dealing with failing tests at integration time is another hot issue for Team B. There are a handful of tests that “always fail” in the integration environment—something to do with the test data not having been set up right—so these are ignored. They do try to fix any additional failing tests that they notice, but they’ve been known to delete a test or two in the name of expediency. If a developer is feeling really rushed (like at the end of the day), he might bypass the “official” check-in process altogether and just commit a few changed files to the repository.
QA Behaviors Testers on the two teams also exhibit very different behaviors. Team A’s tester splits time between hands-on testing of new features and creating automated test scripts for completed ones. With the functional and regression tests being run hourly by the continuous integration server, Tester A is free to spend time checking for problems that are best identified visually (spelling, layout and so on), and come up with creative ways to try to break the system. Tester A enjoys interaction with the developers, who willingly participate to quickly turn around fixes. It’s not uncommon to have typos and other small defects fixed and verified within 10 to 15 minutes
of notifying the team. In contrast, Team B’s tester is constantly in turmoil, spending most of the time waiting for the developers or the build to finish. If there’s a problem with the build—about 1/3 of the time—Tester B is usually the first one to notice. Tester B wants to spend more time on exploratory testing, but is already unable to keep up with reporting the “obvious” problems—a page that has stopped loading, for example. It’s been more than a week since any automated tests have been built, and Tester B is behind on fixing the existing ones. In summary, Team A’s automation tools help them to work quickly and efficiently, with only minor hiccups along the way. They usually go home feeling like they have made good progress. On Team B, however, the automation actually seems to get in the way. The team moves in fits and starts, with some days almost entirely consumed by resolving integration problems and debugging broken tests. Ironically, when it comes time for a demo or a release, Team B’s automated tests are often abandoned in a flurry of frantic activity in order to “get things done.”
So what’s the difference? Why does one team seem to be able to harness the power of automation successfully, while for the other it is a hindrance? We might initially attribute the behavioral differences simply to a lack of skill or discipline. Maybe the people on Team B either don’t know or care enough to do the things that will make them more successful. The answer is more complex. Blaming failure on stupidity or laziness usually isn’t an effective way to change behavior. A better strategy is to assume that people generally want to do the right thing, and to look for problems in the supporting structures that might be preventing it. If we focus on system deficiencies rather human ones, we can often find ways to make it easier for people to do the right thing. Having personally seen many instances of Team B, I don’t believe their problem lies with a lack of ability or motivation. I also don’t think it has very much to do with their choice of tools. Instead, Team B’s biggest problem is a lack of effective feedback in the way they use their automation tools. Simply put, Team A has optimized their feedback loops, and Team B has not. Team B needs to pay careful attention to improving both the amount and type of feedback they’re getting from their automated tests and builds.
People want to do the right thing—so look for problems in the support structures that might be preventing it.
Feedback Brings Accord
Consider a non-technical example of the power of effective feedback to bring about specific behavior. I have a problem remembering to turn my headlights off when I get out of the car. Evidently, I'm not the only person with this problem, since automotive engineers seem to have spent years trying to figure out how best to remind people to do this. As technology progressed, many cars started turning off the lights for you. While my aging Honda Accord doesn't have this latest innovation, it still has a good feedback loop built in. If I turn off the ignition with the headlights on, when I attempt to exit the driver's-side door, I immediately get a "ding-dong" sound that I don't normally hear. This very small difference has saved me from many a dead battery on occasions when I've leapt out of the car in a hurry to get somewhere. Although my human tendency to forget to check the headlights hasn't changed, the immediate audio feedback is sufficient to get me to do the right thing (i.e., turn them off).
I've given some thought to what makes this feedback so effective. First, the feedback is timely. I hear the warning sound the instant that I'm about to do something wrong, when there's still time to do something about it. Second, the feedback is consistent. Every time I'm about to leave the car with the headlights on, I get the same warning sound. And third, the feedback catches my attention because it's unexpected. If my car were to always play some kind of sound when I exited, I'd quickly learn to ignore it.

Creating Effective Feedback
The Honda example demonstrates the three characteristics of effective feedback: timeliness, consistency and command of attention. These can similarly apply to the world of automated tests and builds.
Effective feedback must be timely. Feedback is more powerful the sooner it arrives. After an event, there is often a brief window of opportunity in which feedback will be received and appreciated. The farther apart the stimulus and the response, the less likely we are to associate the two and to learn something. The test that fails within seconds after writing incorrect code is therefore much more useful than the test that fails hours later.
Effective feedback must be consistent. Inconsistency drives human beings crazy. We learn to trust those systems that work consistently and to be wary of those that don't. With software systems, inconsistent feedback manifests itself with tests that pass sometimes but not all of the time, or with builds that sometimes break for apparently random reasons. If we can't trust that a failing test is indicative of a real problem, that test becomes useless as a warning mechanism.
Effective feedback must be attention-grabbing. We need to take human psychology into account when designing feedback. Most people have so much information coming at them that it's difficult to choose what to pay attention to. We actually learn to tune out most things, giving attention only to those inputs that seem particularly important or noteworthy. For example, if the output of a failing test closely resembles the output of a passing test, we're likely to miss seeing the failure. In contrast, the "red bar/green bar" model popularized by JUnit provides a clear distinction that immediately catches the eye.
All three of these characteristics are essential elements of effective feedback. If even one of them is lacking, the feedback becomes much less useful. This is why tuning feedback loops is such a subtle exercise. The difference between a working feedback loop and an ineffective one can be very small. For example, a little over a year ago, my trusty Accord ceased to be consistent with the headlight warning sound. It's still timely and attention-grabbing when it works, but for some reason the "ding-dong" stopped playing some of the time. Although I haven't bothered to investigate yet what's wrong, this one little change in the system has rendered the feedback loop essentially useless. This small difference has cost me both time and money on numerous occasions due to dead batteries.

Helping Team B
Understanding that effective feedback needs to be timely, consistent and
attention-grabbing, let’s see how we might solve some of Team B’s problems by applying these principles. We’ll look at a couple of examples where we can identify which characteristic of effective feedback is missing, and then discuss how to rectify each one. Use of unit tests. Let’s first look at improving Team B’s unit-testing situation. The problems we initially observed were that the developers don’t usually run the JUnit tests while they’re coding and that the tests require lots of extra effort to write and maintain. On closer examination, we discovered that developers don’t run the tests often enough because they take too long. The only facility they have in place is a build-script command that requires a good two minutes to run the entire suite of JUnit tests. While 120 seconds might not seem like a long time, it’s an eternity when you’re in the middle of coding. The combination of having to switch to a command prompt and then wait for a couple of minutes is more than enough to interrupt any programmer’s flow of thought. In this situation, the feedback from the JUnit tests is not timely. Think how much more effective the test feedback would be if a developer could run them every time even a few lines of code were changed. This would also make it much easier to develop additional tests while working. For a developer to be able to use the tests while coding, however, there needs to be an easy way to run the tests within the development environment (IDE) in literally seconds. This is achievable, but it does require some work. First, Team B needs to standardize on a single IDE for all developers so that they can develop and share best practices on running the tests while they work. They should figure out how to take advantage of the built-in IDE support for collecting and running JUnit tests. And they need to ensure that there is a simple way to launch the tests—no matter what they happen to be working on. Most importantly, the team needs to find a good way to separate the slower tests (those that talk to a database or communicate across a network, for example) from the fast ones, so that
they can run all of the fast tests every minute or two. One way to do this is to use test naming or annotation conventions, then use a test runner that recognizes those conventions. I've worked on several projects in which, with a single keystroke, a developer could compile code and run more than 1,500 JUnit tests in around 10 seconds. Once something like this is in place, it becomes a powerful feedback loop. The increasingly popular discipline of test-driven development is based on the simple principle that the more often you can run your code, the better chance you have of getting it right. It's surprising how many teams neglect fine-tuning the fundamental feedback loop of edit-compile-test. The speed at which you can compile, run a set of tests and get simple yes/no feedback affects almost all other aspects of development.
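To make the idea concrete, here is a minimal sketch of the kind of "fast suite" a team might maintain by convention. It assumes JUnit 3.8-style test classes, and the class names are hypothetical stand-ins for a project's own in-memory tests.

import junit.framework.Test;
import junit.framework.TestSuite;

public class FastTests {
    // Only tests that run entirely in memory belong here. Anything that
    // touches a database, the file system or the network goes into a
    // separate SlowTests suite that the continuous integration server runs.
    public static Test suite() {
        TestSuite suite = new TestSuite("Fast tests");
        suite.addTestSuite(OrderCalculatorTest.class);   // hypothetical fast tests
        suite.addTestSuite(CustomerValidatorTest.class);
        return suite;
    }
}

Wired to a single IDE keystroke or toolbar button, a suite like this is what makes the every-few-minutes feedback loop practical.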
Check-in and merge. Next, let's look at the pain Team B is having with checking in and merging their changes. Certainly being able to run the JUnit tests more frequently as they develop will help with this. But we have another problem to contend with. Some tests pass on one developer's machine but fail on another's. Other tests pass when run independently but fail when run together in a suite. Other examples of such "brittle" tests are those that fail unless there is some very specific data in the database (data that the integration database never seems to have). Then there are those tests that just seem to fail randomly at times—no one quite knows why. In other words, feedback from the JUnit tests at integration time is not consistent.
All this contributes to a basic skepticism of the test results. When a test fails at check-in time, the team members are never sure if it's a real problem or just one of the "flaky" tests. They don't know whether it's a problem they introduced or something someone else did (the inclination, of course, is to blame the other guy). Because they can't trust the tests, they often ignore them and thus miss noticing a real problem. Team B's first step to improve this feedback loop is to temporarily disable all the tests that don't give the same results on every developer's machine and in the integration environment. When it comes to feedback, an inconsistent test is worse than no test. Then someone needs to take the time to figure out why the offending tests are inconsistent and fix them. Most likely, the problem tests need to be re-written to be more self-contained and independent. It means removing static variables, taking out dependencies on configuration flags, eliminating reliance on specific dates or times, and so on. Wherever possible, tests also need to be rid of dependencies on any data or environment not explicitly set up by the tests themselves. Once Team B members can trust that a failing test means that something is truly wrong, they need to be diligent about watching for unexpected failures in the future. As soon as they experience a false negative, they need to stop, figure out what went wrong and fix the problem before it happens again. This takes time and effort, but it's an investment that pays for itself many times over. Much more time is wasted dealing with inconsistent feedback.
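As an illustration of removing a date dependency, the sketch below supplies fixed dates to the code under test instead of letting it read the system clock. LateFeeCalculator is a hypothetical class invented for this example, not something from the teams described here.

import java.util.Calendar;
import junit.framework.TestCase;

public class LateFeeCalculatorTest extends TestCase {
    public void testPaymentSixtyDaysLateIncursFee() {
        // Fixed dates make the result identical on every machine, every run.
        Calendar due = Calendar.getInstance();
        due.set(2007, Calendar.JANUARY, 1);
        Calendar paid = Calendar.getInstance();
        paid.set(2007, Calendar.MARCH, 2);

        LateFeeCalculator calculator = new LateFeeCalculator();  // hypothetical class under test
        assertTrue(calculator.isLate(due.getTime(), paid.getTime()));
    }
}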
Paying attention. Now let's ask why the tester on Team B is usually the first to notice when the latest build is broken. Their continuous integration server is supposed to report that. Any time new or changed code is checked into the repository, the CI server retrieves a clean copy of the code, compiles it, runs unit tests (both fast and slow ones), and builds and deploys the runtime artifacts. It only
takes about six minutes to run (timely enough for this situation) and it should be sufficient to flag many kinds of problems—particularly after the team does the work to make the test results consistent. The glaring problem here is that no one on the team seems to be paying attention to the feedback from the continuous integration server. Apparently this feedback is not attention-grabbing. The continuous integration server sends e-mails to notify the team when something goes wrong with the build. But an overabundance of messages has caused these e-mails to be treated as little more than spam. (Remember those tests that were always failing!) Most of Team B either deletes, ignores or filters the build e-mails into a different mailbox. There are better ways to provide feedback. Many teams have found it more effective to use sounds as an integration feedback mechanism. With this kind of setup, the continuous integration server plays an unpleasant audio file when something goes wrong with the build, and a pleasant one when all is well. Failed builds can be heralded by glass breaking, a siren or even Homer Simpson’s “Doh!” The only requirement is that the signal be loud enough for the whole team to hear from anywhere in the team area. This way, team members are notified in a simple way about failed and successful builds without even needing to be at their computers. Other teams use ambient orbs or strings of Christmas tree lights to turn the room red or green depending on the status of the build. The casual observer might smirk at these methods, but they work because they’re effective at grabbing people’s attention.
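One possible way to wire up such an audible alarm is a small program that a post-build script invokes with the build status; the sound file names here are placeholders, and the hook itself is an assumption about how a team's CI server is configured.

import java.io.File;
import javax.sound.sampled.AudioSystem;
import javax.sound.sampled.Clip;

public class BuildAlarm {
    public static void main(String[] args) throws Exception {
        // The CI server's post-build hook passes "FAILED" or "SUCCESS".
        boolean failed = args.length > 0 && "FAILED".equals(args[0]);
        File soundFile = new File(failed ? "glass-breaking.wav" : "all-clear.wav");
        Clip clip = AudioSystem.getClip();
        clip.open(AudioSystem.getAudioInputStream(soundFile));
        clip.start();
        Thread.sleep(5000);  // keep the JVM alive long enough for the clip to finish
    }
}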
Applied Elsewhere
The three principles of effective feedback can be applied to other areas as well. For instance, consider the interaction between the tester and the developers on Team B. All three feedback characteristics could be used to help them work together more effectively. The principle of timely feedback dictates that the tester shouldn't have to wait more than a minute or two for a new build to be pushed to the test environment. The principle of consistency urges developers to provide a more reliable way to build and deploy the system so the tester can trust them when they say a new build is ready. And the tester might use the attention-grabbing mechanism of writing newly discovered bugs on a whiteboard, rather than entering them only into a bug-tracking system. The approach is the same in every case. For each feedback loop, we ask:
• Who needs the feedback?
• How long are they willing to wait for it?
• What's the best medium in which to present the feedback?
Getting the most value from automated tests and builds requires fine-tuning of feedback loops. For feedback to be effective, it has to be consistent, timely and attention-grabbing.
Becoming the A Team
Like Team B, many companies forget that automation tools are not an end, but only a means to that end. Unless properly applied, these tools lose much of their promised value. The difference between teams that are successful with automation and those that aren't is the level of attention paid to getting feedback right. Amazing things happen when you start providing consistent feedback to the right people, at the right time, and in a manner that their brains can process. Greater productivity, improved quality and increased overall satisfaction are frequently the reported benefits. Better yet, you may discover that your people are "A Team" material after all. I pity the fool who doesn't provide effective feedback. ý
Four Steps For Taking Unit Testing To the Next Level For Java EE

By Matt Love

Matt Love is a software development manager with Parasoft. He has been involved in the development of Jtest since 2001.
What's the first thing that comes to mind when you think about unit testing? If you're a Java developer, it's probably JUnit, since the tool is generally recognized as the de facto standard for Java unit testing. However, JUnit doesn't necessarily define the testing methodology for which it may be leveraged. The tool can be used to associate a one-to-one mapping between every test class and tested class in a code base, or it may be used to manage a set of end-to-end tests for an entire Java EE application. But if a development group is asked, "How is this software tested?" and they respond simply that it's unit-tested or tested with JUnit, have they really answered the question?
The emergence of Java EE frameworks for everything from server-side business logic to data persistence forces the requirement for specialized tests that fit an application's frameworks. JUnit is becoming a commonality between specialized testing frameworks for the ever-expanding domain of Java EE applications. Many of these frameworks can be
well tested using proper setup and deployment of JUnit tests that have been supplemented with a framework-specific test harness. However, when additional test harnesses are involved, JUnit is no longer sufficient to identify the means and breadth of testing. This article will teach you several unit-testing techniques that go beyond JUnit as they apply to Java EE applications.
1. Identify the Goals
The first step to implementing a testing solution is to clearly define the goal and scope of each test. After these have been established, the next step is to investigate which implementations of testing for Java Enterprise frameworks are best suited to achieve that goal.
Regression testing. The adage "If it wasn't tested, it probably doesn't work" is true for any software development project. The most important question that needs to be answered by software testing is "Does this application work?" Any type of test can verify if software is
functioning correctly according to the specification. Typically, when new functionality is added, a manual test is written to ensure that the new code has the desired effect. That manual test will answer the immediate question about behavioral correctness, but the application may not work the next day, after the code base has changed. Automated tests provide confidence that existing specifications are still satisfied as the code base evolves. Beneficial code optimization, reorganization or new features are often postponed or rejected because of low confidence in the software’s functional correctness. Such changes are commonly resisted when sufficient tests aren’t available to verify that the changes didn’t break previously working functionality. A good regression suite will not only catch new errors introduced into previously correct functionality, but will provide the confidence needed to make significant modifications faster. Automatic tests are no replacement for manual QA, but they do provide added confidence that certain features are known to be well tested. This allows QA to focus on
testing the higher-risk features.
Unexpected behavior. Sometimes testing is performed to vet unexpected behavior so that bugs can be identified and fixed before they reach the end users. This is a smart goal because leaving such problems for the end user to discover is typically much more expensive—both in terms of the impact on the organization's image and the resources required for a patch release. The best results in testing for unexpected behavior are achieved by a tester other than the person who wrote the code, since such a tester is more likely to think "outside the box" of the original specification. An automated test-generation tool that has no knowledge of the specification is a prime candidate for performing this type of testing. These tools can help you test for unexpected behavior by designing and executing tests that check how the program handles unexpected stimulus and boundary conditions. Tests with unexpected inputs or outcomes can be used as regression tests once they've been reviewed and incorporated into the specification. The end result is a large suite of regression tests that are used to identify when a unit's functional behavior changes or to verify that all units are functioning properly.
Test-driven development. Test-driven development (TDD) is another popular term that comes up when discussing unit testing. When taken to the extreme, TDD means writing tests before writing code. This is difficult to do when the tests need to make programmatic calls to tested code that hasn't yet been written. Granted, this meets the TDD goal of tests that initially fail (because compilation errors are considered failures). However, compilation errors in tests tend to interfere with the rest of the test suite. This is especially true in Java EE systems, where the test suite is compiled to a JAR, EAR or WAR file and deployed in a container. None of the tests can be
deployed if there are any compilation errors. This forces test code and tested code to be written at the same time, and thus fails to comply with pure theoretical TDD, which mandates that tests be written before the code. A more practical approach is to apply TDD to functional tests for problems found in an application's current functionality. TDD procedures fit into the QA cycle very nicely:
• Manually reproduce a problem report
• Reduce the problem to identify the problematic unit(s)
• Write tests to automatically reproduce the problem in the problematic unit(s)
• Verify that the tests fail as a result of the problem
• Write the necessary code to fix the problem
• Verify that the same tests pass as a result of the fix
To truly adhere to TDD, these steps need to be followed for every problem report. The same methodology can be applied to new features in a practical manner, as long as there's a means for the tests to compile and run. If it's not practical to create tests before creating a new feature, the tests should be put in place immediately after the feature is created to serve as regression tests and verify the specification.
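For example, a test written to reproduce a hypothetical problem report might look like the sketch below; PriceCalculator and its behavior are invented for illustration and are not taken from any particular project.

import junit.framework.TestCase;

public class PriceCalculatorTest extends TestCase {
    // Problem report: a discount greater than 100 percent produces a
    // negative price. Written first, this test fails against the current
    // code and passes once the fix is in place.
    public void testDiscountIsCappedAtOneHundredPercent() {
        PriceCalculator calculator = new PriceCalculator();  // hypothetical class under test
        double price = calculator.discountedPrice(50.00, 150);
        assertEquals(0.00, price, 0.001);
    }
}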
The second step attempts to define the term unit testing as the smallest unit that exhibits some functional problem. More comprehensive tests might be appealing because they can test all components at once. However, tests must isolate the target problem and have few other points of failure. Otherwise, maintenance for full end-to-end system tests will be overwhelming. Imagine tests that compare a program screenshot to a saved control. Even the smallest change in presentation will require that all tests be reviewed and updated. Building a suite of tests that operate on the smallest functional units provides the best return for the lowest maintenance overhead. However, these functional units and the associated tests can become very large in Java EE applications when a problem exists in the integration between several components.

2. Define the Scope
Borrowing an analogy from the animated feature film Shrek, it's safe to say that unit tests are like onions and ogres: They all have layers. Unit tests may be performed at the method, class, component or integration level. Even some complete end-to-end system tests can be organized using JUnit in such a way that the unit tested is related to a unit of specification that exercises the whole system. The majority of tests should be for as small a unit as possible to meet the associated goal. Testing small units often exposes problems that aren't obvious when testing larger components or systems. However, an application is likely to fail when there are no tests for the layers where the units of code interact. Integration testing will verify that each unit is not only functioning correctly on its own, but also that the units are connected correctly in the application. Integration testing for Java EE applications involves testing interactions with third-party systems that are assumed to be correct. Third-party systems may not be easily changeable, so components under development must detect and work around any third-party flaws. A Java EE testing strategy isn't complete until it includes tests at every layer of the system.
Code-level. Testing at the class or method level can be done in most
Java EE applications by using mock objects and stubs. Mock objects use special implementations of popular interfaces for testing purposes. These mock objects can be custom classes written for specific tests, or they may be provided by a testing framework. Object-oriented programming allows for the code under test to execute on mock objects as it does on the live objects that are seen in production. Stubs allow specific method calls to be replaced, usually by prepending alternate implementations of classes to the Java class path. With this approach, the tested code can be executed against different dependencies without needing to be recompiled. Code-level testing is easily automated, especially when testing with unexpected inputs. It provides great regression value by identifying specific pieces of code that change functionally. However, when the scenario spans several pieces of code, it's difficult to represent a use case or problem report in a code-level test.
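A hand-rolled mock can be as simple as the sketch below; the CustomerDao interface, the Customer class and the service that consumes them are hypothetical names standing in for an application's own types.

public class MockCustomerDao implements CustomerDao {
    // Returns canned data so service logic can be exercised without a live database.
    public Customer findById(long id) {
        return new Customer(id, "Test Customer");
    }

    public void save(Customer customer) {
        // No-op; a test that cares about saves could capture the argument here.
    }
}

A test for the service layer then constructs the class under test with this mock in place of the production DAO and asserts on the service's behavior alone.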
Component-level. Tests for a component can usually be associated with part of the specification. A component is a functional unit of related classes. In Java EE applications, each component may integrate with one or more enterprise frameworks. A specialized framework for testing is needed to effectively test that a component integrates correctly with other enterprise frameworks—without having to set up the entire enterprise system. These testing frameworks usually process application configuration files and provide helpful functions to facilitate testing. The best approach to component-level tests depends on which enterprise frameworks are involved.
System-level. An enterprise testing strategy is not complete unless the entire system is started, initialized and tested. This is typically the role of manual QA testing, but many system-level tests can be automated. System-level tests exercise the application at the same access points that end users would, and verify the same results that end users would obtain. System tests may also verify internal data at several steps through the process to expedite detection of problems. System testing is often slow, difficult to set up and prone to frequent test failure. Most tests should target more specific components or units of code instead of the whole system. The system level still needs to be addressed, although only a small portion of an enterprise application test suite should be implemented as system tests.
3. Select a Framework
Java EE systems employ many frameworks to speed integration with Web page, Web service and database technologies. Enterprise frameworks are used to simplify the raw interfaces provided by Sun's Java EE development kit. A common enterprise solution is to move configuration information from Java API to XML files. Although XML configuration files simplify development, they complicate testing because traditional JUnit works only with Java classes. Most enterprise frameworks provide test utilities to be used in conjunction with JUnit so that the
development efforts outside individual Java classes can still be tested.
Struts. Apache Struts is an open source framework for building servlet- and JSP-based Web applications. Struts works well with conventional applications as well as with SOAP and AJAX. Apache provides testing frameworks for Struts that mock the Web application server and integrate with the server for testing. The Struts framework simplifies online forms and actions by using simple APIs with an XML configuration file. Web page actions and form data are directed to the appropriate Java code based on data in the Struts configuration file. The mock Struts test framework also uses the same configuration file to emulate the Web application server in an ordinary Java Virtual Machine. As a result, testing Struts applications becomes as easy as specifying the configuration file and context directory once for all tests in the setUp() method. Running mock Struts tests is equally easy. Since the frameworks extend JUnit, tests can be run by any JUnit test runner. The framework supplements JUnit with utility methods to programmatically exercise Struts Web pages and assert results. The same utility methods from the mock framework are available in the Apache Cactus Struts Test framework. The difference is that Cactus will deploy tests and run them in a Web application container instead of mocking the container. Tests written using the mock framework can easily be extended to run in the container to test for integration issues.
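A minimal sketch of such a mock Struts test, assuming the StrutsTestCase project's MockStrutsTestCase base class; the action path, parameters and forward name are hypothetical and would come from the application's own struts-config.xml.

import servletunit.struts.MockStrutsTestCase;

public class LoginActionTest extends MockStrutsTestCase {
    public void setUp() throws Exception {
        super.setUp();
        setConfigFile("/WEB-INF/struts-config.xml");  // assumed location of the Struts configuration
    }

    public void testSuccessfulLogin() {
        setRequestPathInfo("/login");                 // hypothetical action mapping
        addRequestParameter("username", "tester");
        addRequestParameter("password", "secret");
        actionPerform();
        verifyForward("success");
        verifyNoActionErrors();
    }
}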
Spring. The Spring framework also incorporates XML configuration files to facilitate an abstraction layer between plain old Java object (POJO) logic and the Web application container. This allows for many scenarios to be tested with traditional JUnit and mock data access objects (DAO) for the service layer. However, some functionality requires integration testing that JUnit cannot handle on its own. A Spring test framework provides a way to test the Java code with respect to the configuration files—without requiring deployment in a container. This is achieved using the spring-mock.jar file that ships with Spring. You can also run unit tests for Spring applications in a container using
Cactus. Thus, unit testing is feasible for Spring code at the class level, the mock-container level and in a running application server container. This is ideal when creating regression tests for functionality or applying TDD to problem reports. Test cases for Spring can involve as much or as little of the application and surrounding system as needed.
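As a sketch of a Spring test that loads the real configuration files outside a container, the example below assumes the Spring 2.x-era base class shipped in spring-mock.jar; the bean type and configuration file name are hypothetical.

import org.springframework.test.AbstractDependencyInjectionSpringContextTests;

public class OrderServiceContextTest extends AbstractDependencyInjectionSpringContextTests {
    private OrderService orderService;  // injected by type from the loaded context (hypothetical bean)

    public void setOrderService(OrderService orderService) {
        this.orderService = orderService;
    }

    protected String[] getConfigLocations() {
        return new String[] { "classpath:applicationContext.xml" };  // assumed file name
    }

    public void testContextWiresTheService() {
        assertNotNull(orderService);
    }
}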
Data access objects with Hibernate. Hibernate is object relational mapping (ORM) for persisting data access objects (DAO) in databases. It's used by the Spring framework and can manage database transactions and provide an abstraction layer for any SQL or JDBC code. Hibernate uses XML mapping files to relate database elements to Java DAO. Spring framework code that uses the DAO can be tested easily by providing mock objects that implement the DAO interface or override calls that would go to the database. You can also test that the mapping files and database transactions are working properly, similar to how the Spring test framework verifies Spring configuration files. The org.springframework.orm.hibernate3 package in spring.jar provides classes for the Hibernate configuration, session and template properties to be configured for testing. This is adequate to detect errors in the mapping XML files, but special care must be taken for the database. Test results may not be repeatable or deterministic when the tests change persisted data in a database. Fortunately, the test harness can be configured to use a volatile database in memory that won't be persisted between runs. Hypersonic HSQLDB is a good example of an in-memory database. The database can be initialized with a snapshot of data and later examined to verify if the tests manipulated the data correctly. The database in memory provides the benefits of testing that the data written to it through the Hibernate framework can be retrieved using the same framework—without the consequences of permanently altered data or the risks of deleting important data.
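One way to point Hibernate at an in-memory HSQLDB instance for testing is to override the connection properties in the test fixture, as in this sketch. The property values are standard Hibernate 3 settings; the surrounding test class is illustrative.

import junit.framework.TestCase;
import org.hibernate.SessionFactory;
import org.hibernate.cfg.Configuration;

public class InMemoryDatabaseTest extends TestCase {
    private SessionFactory sessionFactory;

    protected void setUp() {
        // Reuse the normal mapping files, but swap the connection for an
        // in-memory HSQLDB; create-drop rebuilds the schema on every run.
        Configuration cfg = new Configuration().configure()
            .setProperty("hibernate.connection.driver_class", "org.hsqldb.jdbcDriver")
            .setProperty("hibernate.connection.url", "jdbc:hsqldb:mem:testdb")
            .setProperty("hibernate.connection.username", "sa")
            .setProperty("hibernate.connection.password", "")
            .setProperty("hibernate.dialect", "org.hibernate.dialect.HSQLDialect")
            .setProperty("hibernate.hbm2ddl.auto", "create-drop");
        sessionFactory = cfg.buildSessionFactory();
    }

    protected void tearDown() {
        sessionFactory.close();
    }

    public void testSessionOpens() {
        assertNotNull(sessionFactory.openSession());
    }
}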
System-level testing against the production database that is persisted in the file system is also possible, but it's usually difficult to set up such a database from a snapshot for every test. The risk that another client may access the data simultaneously during testing adds to the difficulties of testing with file system–persisted databases. Any system-level testing should use a dedicated test database that won't interfere with valuable live data.
Eclipse plug-in development. Even though Eclipse plug-ins aren't considered to be Java Enterprise applications, Eclipse IDE plug-in development is a good example of using a testing framework that runs units inside a larger application container. Eclipse provides a framework to run JUnit tests as additional plug-ins when launching a graphical workspace. The plug-in tests can programmatically control the Eclipse IDE in a way that visually displays actions as they happen during testing. For example, a plug-in test can use the Eclipse API to import a new project, refactor source code and check for compilation errors. This is yet another example of a framework that extends JUnit to check that the code under development is integrating correctly with the system in which it's contained.
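Run as a JUnit plug-in test inside a host workbench, a sketch like the following can drive the workspace API directly; the project name is arbitrary and the test assumes it is launched through Eclipse's plug-in test runner rather than plain JUnit.

import junit.framework.TestCase;
import org.eclipse.core.resources.IProject;
import org.eclipse.core.resources.ResourcesPlugin;

public class WorkspaceSmokeTest extends TestCase {
    public void testProjectCanBeCreatedInTheRunningWorkbench() throws Exception {
        IProject project = ResourcesPlugin.getWorkspace().getRoot().getProject("sample");
        project.create(null);   // null progress monitor
        project.open(null);
        assertTrue(project.isAccessible());
    }
}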
4. Cover the Entire Spectrum
Techniques for unit testing can be applied to any level of an application. Unit-testing strategies for Java EE applications aren't complete unless every layer is addressed by the tests. Tests for the top layer exercise smaller units at lower levels, but such tests are highly sensitive to changes and fail easily. These system-level tests should be used sparingly, but a few are essential for ensuring that all components fit together. Component-level tests are often able to tell a story or test a scenario without depending on the entire system. This makes them the best regression tests for verifying code that corrects problem reports or implements feature specification. Code-level tests that focus on one class or method at a time are excellent for pinpointing regression changes, but they're usually difficult to understand because they lack the context that component-level tests provide. Automated test–generation tools are best suited for creating a code-level test for every class or method because that amount of test creation is tedious when done manually. JUnit itself is not sufficient to test every layer of a Java EE application. Specialized testing frameworks must be used and matched to the Java EE frameworks used to build the application. Mock objects, configuration processors, synthetic databases and live containers all play a part in a complete Java EE testing solution. Having a test suite that covers the entire spectrum allows application development to proceed with confidence and reliability. ý
REFERENCES
• www.junit.org
• struts.apache.org
• www.springframework.org
• jakarta.apache.org/cactus
• www.hibernate.org
• hsqldb.sourceforge.net
• www.eclipse.org
The Security Zone
Secure Software From the Ground Up
Playing the Part of Protector
Implementing User Role-Based Security Testing For Enterprise Applications
By Linda Hayes

Security has become a critical issue at many levels, including access to individual computers, networks, services, applications and accounts. News reports of security breaches are all too common. Testing security has likewise become complex, as intrusion can come through a variety of channels and techniques. Because of their cross-functional capabilities and tight integration, enterprise applications must provide additional levels of security control to specific components and data elements. This level of security is critical to enforcing corporate policies and internal controls, and is a subject of compliance audits. This type of testing, done thoroughly for each change, can be extremely time-consuming—so it's also an ideal candidate for automation. The key to successful automation is to make the up-front investment to create and support a repeatable, sustainable process. The following is a guideline to organizing, defining and automating your role-based security testing.
Understanding Role-Based Testing
The most common experience we have with security is controlling initial access to a system or software: Think of a typical log-in page. While this level of security is essential and must be tested against various forms of assault, it's by no means the most difficult. In contrast, role-based security controls access to specific components or data within a system based on the rights of the user. Unlike the all-or-nothing rights of the log-in, this level of security may hide menu options, fields or screens from users based on their role, or may permit or deny reading or writing to particular data elements.
Consider a payroll system. Most employees probably have no access rights at all, while payroll clerks may be authorized to use the system to enter time-card information but are likely prohibited from perusing salaries, making changes to rates and benefits, and adding or terminating employees. Supervisors, on the other hand, may be able to manage benefits, but only a senior manager may be able to see compensation information and add or terminate employees. In other words, various capabilities of the application are selectively exposed based on the role of the user. These restrictions are obvious within the context of a single application. Less obvious and far more complex are the roles within an enterprise resource planning (ERP) system involving multiple interrelated applications that enable end-to-end business processes such as order-to-cash and procure-to-pay. In this environment, a transaction created in one application may have far-reaching consequences in downstream systems.
Linda Hayes is the CTO of Worksoft and the founder of three software companies, including AutoTester, which delivered the first PC-based test automation tool.
For example, a manufacturing plant might have minimum staffing levels specified for each position within each shift, and if the minimum isn't met, the plant isn't deemed available for production scheduling. In this situation, a payroll clerk with authority to change the department code for a plant employee could inadvertently shut down the plant by violating the minimum staffing requirements. Because of the potentially widespread impact of each activity within such a tightly integrated environment, role-based security testing is as essential as it is challenging.
Defining User Roles and Rights
Testing role-based security involves the verification that user roles and rights are enforced by the software, so the natural foundation of your test effort is the definition of these roles and rights. Many commercial ERP systems provide configurable role-based security settings that are discoverable, usually in the form of a matrix of roles, functionality and permissions. Customized or internally developed systems, or those that integrate with disparate legacy applications, may require additional effort to uncover this information. By whatever means acquired, the goal is a matrix of user roles, functionality and permissions. But you can't stop there. Simply accepting the definitions as they exist wouldn't satisfy the purpose of testing, which is to be sure that they're both correctly expressed and properly enforced. Your next step is to validate that the settings comply with corporate policy. This may be as simple as obtaining signoff from your corporate security officer or as involved as interviewing owners of application or functional areas or reviewing corporate policy handbooks.
Once you have a valid set of definitions, you're ready to test the system's compliance. Unfortunately, that's not as easy as systematically testing and checking off the boxes in the matrix. This is because few systems allow flat navigation access to all components. In many cases, multiple navigational steps are required to expose a particular function, window or field. Further, this navigation likely requires data values that permit or trigger navigation or functions. This means that your test plan and test cases must encompass a more holistic view of the enterprise and the application, encompassing the business processes and supporting environment.
Business Processes
Business processes represent the activities performed by various users of an enterprise application. These are scenarios or workflows that describe a path through the application to accomplish a particular task, such as entering a product order or shipping goods. Each role is associated with a set of processes, and some processes may be linked to more than one role. There are also end-to-end business processes, such as order-to-cash, that span application components and user roles, and these must be incorporated into any comprehensive test strategy because of their integration impact. Without a test plan that mimics the interrelationships among users, processes and data, comprehensive coverage can't be achieved.
The definition of user roles and rights is typically organized around business processes in ERP systems instead of discrete components. Each role represents the profile of a particular job description. Again, the human resources department may have data entry clerks, supervisors, managers and so forth, each authorized to carry out specific business processes. The set of rights is then organized to enable these processes for each role and prohibit others. When testing, it's equally important to ensure permission as prohibition. Most of us think of security in terms of denying access to unauthorized users, but it's perhaps even more critical to permit it to authorized ones. While exposing data or capabilities to the wrong user may violate internal control policies, refusing necessary access may prevent one or many employees from getting their jobs done. For example, a warehouse inventory system that correctly enforces an internal policy that prevents shipping clerks from logging breakage or spoilage but also prevents the supervisor or manager role from doing so could result in incorrect inventory levels that cause supply-chain, ordering and fulfillment errors. Therefore, your test plan should include both positive and negative tests to ensure that users can perform their assigned business processes—and nothing else.
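In JUnit terms, the paired positive and negative checks might look like the sketch below; PayrollSession and its methods are hypothetical helpers that log in under a given role and probe what the application exposes, not part of any particular product.

import junit.framework.TestCase;

public class PayrollClerkSecurityTest extends TestCase {
    public void testClerkCanEnterTimeCards() {
        PayrollSession session = PayrollSession.loginAs("payroll_clerk");  // hypothetical helper
        assertTrue(session.canOpen("TimeCardEntry"));
    }

    public void testClerkCannotViewSalaries() {
        PayrollSession session = PayrollSession.loginAs("payroll_clerk");
        assertFalse(session.canOpen("SalaryMaintenance"));
    }
}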
The Model Enterprise
The next step is to define the environment in which these business processes are executed. This is the foundation of any test effort: the set of software, data, activities and platform. In some industries it's common to establish a "model office" for user acceptance testing to house this environment. I've seen banks actually construct physical model branches where they test hardware and software and provide training. Users representing each role reported for "work" at the model office and performed their usual daily or weekly tasks. A processing schedule was defined that mimicked the production environment, including overnight batch sessions. This "sandbox" branch was used to validate the integrated set of systems for each branch before they were rolled out to hundreds or thousands of locations. This model has appeal but needs to be updated. Enterprise applications require true enterprise-level testing. Instead of a model office, you need a model enterprise.
The model enterprise is a controlled and repeatable set of hardware, software and data, including users, business processes and job schedules, that supports the necessary end-to-end business process flows for enterprise applications. You should plan for this effort to be the most challenging of the entire test endeavor and consume 50 percent or more of the overall effort, especially the data. Integration within enterprise applications as well as with external legacy systems requires that data conditions be orchestrated across multiple databases, files, services and interfaces.
Data Strategies
For a test process to be effective and efficient—whether manual or automated—predictability is essential. And predictability requires control of the data, otherwise testers spend the majority of their time looking for valid data or trying to diagnose failures from data issues. Predictability is also crucial to reusability. For tests to be reused, either the same data must be restored or procedures must exist to replenish or extend the data. For example, without a plan to add new inventory, repeated execution of an order-to-cash business process will deplete inventory until eventually orders can't be filled. Although archiving and refreshing the entire environment every time would be ideal, it's not always practical or even feasible. And while it's tempting to simply make a copy of production in the name of creating a realistic test bed, this may violate confidentiality and security by exposing the actual data of real persons. Any data strategy that involves production data must include a procedure for scrubbing or scrambling data values to obfuscate sensitive information. If automation is available, it may be more cost effective to load the needed data instead of trying to store it. Or, a mixture may be adopted: The environment may be restored to a certain state and then the needed data is added.
Test Environment Permissions
Another important element of this environment is, of course, the availability of log-ins that represent each role. While testers themselves aren't payroll supervisors, for example, they must have the same log-in rights if they're to test the security rules. Often organizations may balk at giving testers role log-ins deemed sensitive or high risk. But if the test environment and data are properly managed, there should be no opportunity for exposure. The other extreme—giving testers superuser rights to everything—won't work either, since it limits their ability to test the variations in rights among roles.
The Enterprise Calendar
All of the above factors—users, processes and environment—must be organized into a schedule to coordinate the end-to-end business processes. Some dependencies are easy: You can't terminate
an employee who hasn’t been hired yet. Others are less so: You can’t ship goods from inventory if they haven’t been produced yet, and you can’t invoice an order that hasn’t been shipped. The easiest approach is to construct a calendar of virtual days, weeks and even months or other intervals, and plan the user roles and processes within it. If the system functionality allows it, multiple virtual days could actually be executed in a single actual day, but in other cases, this may not be possible without adjusting the system date and time. For example, an insurance policy may have to be in effect for a stated period of time before a claim can be filed. While this is a simple concept, implementing it in reality can be quite complex. Tightly integrated enterprise applications may reflect updates instantly, while others may require batch jobs to be processed before changes are reflected. Once all this is in place, you’re ready to deliver robust role-based security testing.
Prove It!
In these regulated times, it's imperative not only to perform role-based security testing, but to document the results. While manual testers can complete check lists and capture screens, this is time-consuming, often random and hard to manage. Automated tests, on the other hand, can provide a consistent, detailed audit trail of both actual and expected results. Take the extra time to designate a central location, if not a repository, for test results so that they can be shared with developers and mined for audit. Whether manual or automated, the end game is to prove that each element of the role/function/rights matrix is tested and any exceptions captured.
The Obvious Question
By now you should be wondering how long all this will take and whether you have the time and resources to dedicate to it. The answer is probably no, at least as a stand-alone exercise. On the other hand, if role-based security testing is integrated with business-process functional testing, the incremental effort may be justifiable. Functional and user acceptance testing will likely require a comprehensive test environment as well, and their testing of end-to-end business processes will likely touch the same areas. Simply by adding additional verification points to test cases for permitted business processes and adding negative tests for processes that should be prohibited, functional and security testing can both be accomplished at the same time. Security challenges confront corporations at every front. At the enterprise level, with multiple users and applications interacting, these take on even more significance and complexity. An orderly and thorough yet efficient test strategy is necessary to validate user role-based security, specifically one organized around the enterprise business process and operational model. ý
Best Practices
The Key to Testing With Eclipse? Community.
Eclipse is growing up. The upstart open-source, platform-independent software framework turns seven in just three months. That's old enough to have moved beyond vaporware and breathless hype into a slew of commercial products. And it's not just IBM—which donated Eclipse to the open source community in 2001. Today, Adobe, Intel, Nokia, Sybase and scores of smaller companies ship Eclipse-based software.
After I post my initial question, Mark Dexter chimes in with a pointer to a great screencam video on test-driven development with Eclipse. (Google "test first development using Eclipse, screencam" to find the video.) "Eclipse has several features that help streamline [test-driven development, such as] automatically creating empty methods [and] letting you run a program that contains compilation errors," writes Dexter, co-founder of a Seattle-based ERP company, Dexter+Chaney, that serves the construction industry.
"I'm very pleased with all the automation Eclipse allows, such as filling out try/catch blocks and suggesting method names as you type," adds Michael Donohue, the team lead for Java development at San Francisco–based Coverity, which released its Prevent Desktop for Eclipse product at the JavaOne Conference in May. "One gotcha is keeping your mind engaged. Since Eclipse does a pretty good job of making suggestions, it's easy to stop thinking about what comes out."
I could have researched this entire column by talking to small shops hanging out in Eclipse newsgroups. However, some of the biggest players in tech also contribute serious resources to the project, from programming talent to Website–infrastructure support. So it makes sense to also hear from a few of these more established players, several of which are placing big bets on their own Eclipse-based projects.

IBM Says API
With a clutch of products under its WebSphere, Lotus and Rational brands incorporating Eclipse, IBM is a logical first stop. What does Big Blue consider
to be the three most important focus areas for testing in Eclipse? “API, API, API,” says Harm Sluiman, an IBM Distinguished Engineer in Ontario, Canada. Even down to the functionality provided by its core Rich Client Platform, Eclipse can be thought of as one kludged-together mass of plug-ins. This architecture is associated with several technical advantages, including a relative ease in extending Eclipse to nonJava languages, such as C and Python. “Other integrated development environments have opened their APIs much later, and therefore don’t have nearly as many open source plug-ins or commercial plug-ins extending their testing or other functionality,” says Nada daVeiga, a Los Angeles–based Parasoft product manager for Jtest, available in its full functionality as an Eclipse plug-in. However, the highly modular Eclipse architecture also puts a premium on defining clean APIs, which should only be done after strong collaboration with consumers, Sluiman believes. “When components come together and the API usage is not clean, there is a high probability that there will be problems, especially from driver to driver, which is sure to disrupt any test effort,” says Sluiman, who’s on the project management committee of the Eclipse Test and Performance Tools Platform (TPTP) project. “Stabilization of a new API is a key milestone to track, and this should come before or early in the integration testing cycles.”
Log It in Bugzilla, Stay Engaged
In its early days, Eclipse was an IBM-only experiment, but today, close to 20 companies are signed on to the project as gold-plated supporters of the Eclipse Foundation. Behind their fancy title—the companies are declared to be Strategic Members—is a commitment to provide developer resources and membership dues that can be as high as US$500,000 annually. For that kind of dough, you might think these firms get special treatment. But that's not necessarily the case, says Bill Roth, vice president of Workshop products at BEA Systems. "We are consumers of the Eclipse platform," says Roth. "We perform customer acceptance testing of every build we take from Eclipse and then perform integration testing of our plug-ins." When BEA uncovers an Eclipse bug, the company's engineers log it in Bugzilla, just like anyone else in the Eclipse community. The key to getting the bug resolved, says BEA test engineer Rebecca Weinhold, is staying involved and providing feedback.
Weinhold uncovered squirrelly behavior in the Eclipse Web Tools Platform project while debugging JavaServer Pages in a BEA product. The Eclipse developer who responded to her Bugzilla report said the code appeared to be working as designed, but asked for Weinhold's opinion as to how it could be improved. "The bug might've been dropped at that point if I hadn't responded with some feedback to make my case," she says. "After a few exchanges, with comments from other Eclipse users, we came up with an improved user experience, which the developer agreed to implement and I promised to test."
Weinhold's experience points out a last thought on Eclipse worth mentioning. Namely, as any child prodigy eventually learns, while it's great to be smart for your age, you eventually have to figure out how to work constructively with your peers. Certainly, the expanding functionality of Eclipse and its pool of plug-ins is noteworthy. But more impressive is the coordinated teamwork displayed by the
Eclipse community. "The main issues in software quality, like the toughest issues in software development, are process issues, not technical issues," says Jeff Bocarsly, a Columbia Ph.D. and vice president of RTTS, a New York City–based software consulting firm. "The human problems of estimating the work effort required for a creative enterprise, organizing the work efficiently, communicating clearly between all stakeholders, reporting progress accurately, dealing with schedule and delivery adjustments during the work and documenting work properly are always the toughest to solve."
Bocarsly's comments provide some context for the collective achievement of the Eclipse community: successfully delivering two major releases, Callisto and Europa, each with what Sluiman estimates to be tens of millions of lines of tested code. Not bad for a seven-year-old. Now, just wait until Eclipse is old enough to start driving. ý
Geoff Koch writes about science and technology from Lansing, Mich. Write to him with your favorite acronym, tech or otherwise, at koch.geoff@gmail.com.
Future Test
"The App Is Down!" Those fateful words, too often spoken at five o'clock on a Friday or over the phone late at night, can destroy a weekend or even a career, depending on how often they're heard and from whom. With today's increasingly complex applications, it's hard to effectively handle the challenges that all too frequently crop up. Regardless of the technology in play in your enterprise, you might as well just say, "Help me!" because application management tools haven't kept pace with the changes that new infrastructure products and concepts are delivering.
The common thread in new application infrastructure like portals and SOA is to put more and more logic inside the black box. These levels of indirection make it difficult, if not impossible, for operations staff to manage application performance and availability. These excess costs, budget slips and lack of control lead ultimately to a poor customer experience. The key is understanding what an application's makeup (at a deep level) looks like, and correlating its components to the overlying business processes that use them.

Keeping in Focus
Focus is important. Keep teams focused on particular functions. You don't want developers trying to figure out sales processes or the sales team writing Web content. Focus is important in the operations side, too. Application developers spend anywhere from 25 to 100 percent of their time helping manage production applications. Similarly, when operations teams learn how to program and/or architect application code, they're not dealing with the critical tasks needed to keep the IT shop, network and infrastructure in tip-top shape.
Unfortunately, the integration (or middleware) market around server-side Java (and .NET to a degree) has spawned a lack of organizational focus that perpetuates a cycle of non-accountability. Operations teams, fearing they don't have the skills to monitor and manage these applications, refuse to be responsible for performance. That means the development and/or architecture teams must always be ready for any issues that occur. Customers have told me that well over 80 percent of the production problems that the development team works on are not code-related, which leads to bridge calls and finger-pointing.

Automation Is the Answer
However, there is a way of dealing with all these problems: Stop the cycle of organizational defocusing and improve your customer's experience by automating the management of your application's performance. Expert systems aren't new—they exist in several specialty applications, but in few management tools. While many system management frameworks are starting to incorporate rules modeling engines, they haven't caught on—mainly due to the excessive manual processes needed to create the rules and model scripts. To take advantage of expert systems, rules engines and models, the ideal platform would be entirely automated. This must include the detection of application change and the reflection of that change in the monitored view. This is where all of those nasty black
This is where all of those nasty black boxes of Java, .NET, Web services and SOA can actually end up helping. Since these are all essentially frameworks, software companies can automate the analysis of applications built on top of them (and of the underlying platforms). Because the frameworks and platforms themselves are based on standards, a tool can apply a set of common rules, auto-generate a model, and run an expert system that picks monitoring points and generates performance dashboards, automating the entire process.

Complete automation through application modeling allows organizations to automatically perform the three key tasks of performance management:
• Setup and configuration of the application performance–management environment
• Correlation and analysis
• Change management

An expert system for automated monitoring using a model- and rules-based approach allows operations teams to monitor the right application details, specifically those tied to important business processes. It cuts the umbilical cord to the development/architecture teams because their expertise is no longer needed to look at every problem. That means developers stay focused on developing new applications, bringing budgets back in line and projects back on time. A further benefit is consistency of monitoring quality across all applications, meaning that customer experience, and ultimately satisfaction, will be maximized across the entire portfolio.

As application management becomes more complex, traditional automation and system management tools just aren’t going to cut it, particularly for complex composite applications leveraging frameworks such as portal or integration. But by understanding what your application is made of, keeping project roles in focus, and using new platforms built on application modeling and automation, your Friday afternoon exclamation might change from “Help me!” to “Have a great weekend!”

Jason Donahue is president and CEO of ClearApp, which makes application performance–monitoring tools.
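To make the model- and rules-based approach above concrete, here is a minimal, hypothetical sketch in Java. It is not taken from the article or from any ClearApp product; every class, component and rule is invented for illustration. It simply applies a few simple rules to an application model in order to pick monitoring points tied to business processes.

```java
import java.util.ArrayList;
import java.util.List;

public class MonitoringPointGenerator {

    /** A component discovered from the application model (e.g., by scanning a portal or SOA framework). */
    static class Component {
        final String name;
        final String type;            // e.g., "servlet", "ejb", "web-service"
        final String businessProcess; // business process this component supports, or null if unknown
        Component(String name, String type, String businessProcess) {
            this.name = name;
            this.type = type;
            this.businessProcess = businessProcess;
        }
    }

    /** A monitoring point the rule engine decides to watch. */
    static class MonitoringPoint {
        final String component;
        final String metric;
        final String businessProcess;
        MonitoringPoint(String component, String metric, String businessProcess) {
            this.component = component;
            this.metric = metric;
            this.businessProcess = businessProcess;
        }
        @Override
        public String toString() {
            return businessProcess + " -> " + component + " [" + metric + "]";
        }
    }

    /**
     * A toy rule set: components tied to a business process get response-time and
     * error-rate monitors; Web services additionally get an availability check.
     */
    static List<MonitoringPoint> selectMonitoringPoints(List<Component> model) {
        List<MonitoringPoint> points = new ArrayList<MonitoringPoint>();
        for (Component c : model) {
            if (c.businessProcess == null) {
                continue; // rule: only monitor what maps to a business process
            }
            points.add(new MonitoringPoint(c.name, "response-time", c.businessProcess));
            points.add(new MonitoringPoint(c.name, "error-rate", c.businessProcess));
            if ("web-service".equals(c.type)) {
                points.add(new MonitoringPoint(c.name, "availability", c.businessProcess));
            }
        }
        return points;
    }

    public static void main(String[] args) {
        // In a real platform this model would be auto-generated from framework metadata.
        List<Component> model = new ArrayList<Component>();
        model.add(new Component("CheckoutService", "web-service", "Order Processing"));
        model.add(new Component("CartServlet", "servlet", "Order Processing"));
        model.add(new Component("AdminConsole", "servlet", null));

        for (MonitoringPoint p : selectMonitoringPoints(model)) {
            System.out.println(p); // feed these into a dashboard or alerting system
        }
    }
}
```

In a real platform the model would be discovered automatically from the framework and platform metadata, and the rule set would be far richer, but the flow stays the same: model in, business-relevant monitoring points and dashboards out.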
SUPER EARLY-BIRD DISCOUNT! REGISTER BY AUG. 31 – SAVE $350!!
PUT ECLIPSE TO WORK! LEARN HOW TO BUILD BETTER SOFTWARE USING ECLIPSE! CHOOSE FROM MORE THAN 70 CLASSES!
WRITE BETTER SOFTWARE by leveraging Eclipse’s features
GO BEYOND THE FREE IDE with Eclipse add-ons and plugins
LEVERAGE CODE REUSE with the Eclipse Rich Client Platform (RCP)
SAVE TIME AND MONEY with proven productivity tips
GET THE INSIDE TRACK on the hot new Eclipse 3.3 and Europa code releases
MOVE INTO THE FUTURE with AJAX, Web 2.0 and SOA
LEARN FROM THE BEST experts in the Eclipse community
BECOME AN ECLIPSE MASTER by taking classes grounded in real-world experience
HYATT REGENCY RESTON • RESTON, VA • NOVEMBER 6–8, 2007
KEYNOTES ANNOUNCED! DAVID INTERSIMONE • ROBERT MARTIN
Platinum Sponsor • Gold Sponsors • Silver Sponsor
WASHINGTON D.C. AREA!
Produced by BZ Media. BZ Media is an Associate Member of the Eclipse Foundation.
www.eclipseworld.net A BZ Media Event
For sponsorship opportunities or exhibiting information, contact Donna Esposito at 415-785-3419 or desposito@bzmedia.com.
Make sure your critical applications are never in critical condition. We’ve turned I.T. on its head by focusing on I.T. solutions that drive your business. What does this mean for Quality Management? It means efficiency that results in shorter cycle times and reduced risk to your company. It also means you can go live with confidence knowing that HP Quality Management software has helped thousands of customers achieve the highest quality application deployments and upgrades. Find out how. Visit www.hp.com/go/software. Technology for better business outcomes.
©2007 Hewlett-Packard Development Company, L.P.