

Best Practices: Training a Test Team

VOLUME 3 • ISSUE 10 • OCTOBER 2006 • $8.95 • www.stpmag.com

Digging Into Root Causes: J2EE Application Performance Woes Brought to Light
Findbugs Helps With Report Generation
Build a Quality-Driven Development Process

Thirteen Steps To Bulletproofing Your Automation Scripts


Ship Software OnTime.

The Fast & Scalable Team Solution for Defect & Issue Tracking • Feature & Change Tracking • Task & To-Do List Tracking • Helpdesk Ticket Tracking

OnTime is the market-leading project, defect and feature management tool for agile software development and test teams. OnTime facilitates tracking, analyzing and trending team-based software development efforts in an intuitive and powerful user interface. A fully customizable UI, powerful workflow, process enforcements, two-way email communications and custom reports combine to help software development teams ship software on-time!

Available for Windows, Web & VS.NET 2003/2005

OnTime 2006 Professional Edition
• For Teams of 10 to 1,000 Members
• From $149 Per User

OnTime 2006 Small Team Edition
• For Teams up to 10 Members
• Free Single-User Installations
• $495 for 5 Team Members
• $995 for 10 Team Members
800·653·0024 software for software development™

www.axosoft.com

Only $495 for up to 5 Users • Only $995 for up to 10 Users Free Single-User Installations


Stop bugs before they start. Jtest automates the valuable yet often tedious task of code reviews and unit testing, allowing development teams to adopt a "test as you go" strategy that promotes testing code as it's developed, so that quality is built into the code from its inception and bugs are eliminated before they infect the rest of the code base. The result: Cleaner, safer, more reliable and consistent code. Reduced code rework. Faster and more predictable release cycles, reliable applications and happier, more productive end users. For more information on Parasoft Jtest, call … ext. … or go to www.parasoft.com/Jtest.



VOLUME 3 • ISSUE 10 • OCTOBER 2006

Contents

12


COVER STORY: You Can Root Out J2EE Performance Problems

Get your hands dirty! Java 2 Platform, Enterprise Edition application issues are often buried deep in the code. By Edie Lahav

16

Shining Light On Java Code

Static analysis tools aid in delivering on-time code. By Alan Berg

24
Get Quality Into Your Head
This six-stage template can lead you to a successful quality-driven process. By Nigel Cheshire

29
Automation Scripts in 13 Steps
Are your automation scripts too fragile? Go step by step to help make them bulletproof. By José Ramón Martínez

Departments

7 • Editorial
All things change—from editors to solar systems. By Lindsey Vereen

8 • Out of the Box
New products for developers and testers. Compiled by Alex Handy

10 • Peak Performance
Use your numerical know-how to interpret statistical data. By Scott Barber

35 • Best Practices
In pedagogy, the personal touch adds an essential element. By Geoff Koch

38 • Future Test
How to address the challenge of application manageability with health models. By Mike Curreri

www.stpmag.com •

5


CodePro AnalytiX™ is developed by the experts who brought you the popular book Eclipse: Building Commercial Quality Plugins — Eric Clayberg & Dan Rubel

CodePro AnalytiX for Eclipse, Rational® and WebSphere®

Drive out quality problems earlier in the development process. Powerful code audits and metrics make every developer a Java expert.

Create tight, maintainable systems. Wizards generate unit and regression tests, and analyze testing code coverage.

Enforce quality measures across teams. Define, distribute, and enforce corporate coding standards and quality measures, across individuals and teams… no matter where they’re located.

Key Features of CodePro AnalytiX
• Detect & correct code quality issues... automatically
• Define, distribute & enforce quality standards across development teams
• 800+ audit rules and metrics, 350+ Quick Fixes
• Powerful management reporting
• Code metrics with drilldown & triggers
• Audit Java, JSP and XML files
• JUnit test case generation
• Code coverage analysis of test cases
• Integrated team collaboration
• Dependency analysis & reporting
• Javadoc analysis & repair
• Seamless integration with Eclipse, Rational® and WebSphere® Studio

Spend less time and money to develop high-performance Java systems. Dramatically improve your application development process and code quality… leading to more reliable, maintainable systems.

the development tools experts
www.instantiations.com 1-800-808-3737
No one outside of IBM has more experience creating Eclipse-based Java development tools

© Copyright 2006 Instantiations, Inc. CodePro AnalytiX and CodePro are trademarks of Instantiations.

All other trademarks mentioned are the property of their respective owners.


A MESSAGE FROM THE EDITOR

VOLUME 3 • ISSUE 10 • OCTOBER 2006

EDITORIAL
Editor: Lindsey Vereen +1-415-412-4314 lvereen@bzmedia.com
Senior Editor: Alex Handy ahandy@bzmedia.com
Editorial Director: Alan Zeichick +1-650-359-4763 alan@bzmedia.com
Copy Editor: Laurie O'Connell
Contributing Editors: Scott Barber sbarber@perftestplus.com; Geoff Koch koch.geoff@gmail.com

ART & PRODUCTION
Art Director: LuAnn T. Palazzo
Art/Production Assistant: Erin Broadhurst

SALES & MARKETING
Publisher: Ted Bahr +1-631-421-4158 x101 ted@bzmedia.com
Advertising Sales Manager: David Karp +1-631-421-4158 x102 dkarp@bzmedia.com
Advertising Traffic: Phyllis Oakes +1-631-421-4158 x115 poakes@bzmedia.com
Marketing Manager: Marilyn Daly +1-631-421-4158 x118 mdaly@bzmedia.com
List Services: Nyla Moshlak +1-631-421-4158 x124 nmoshlak@bzmedia.com
Reprints: Lisa Abelson +1-516-379-7097 labelson@bzmedia.com
Accounting: Viena Isaray +1-631-421-4158 x110 visaray@bzmedia.com

READER SERVICE
Director of Circulation: Agnes Vanek +1-631-421-4158 x111 avanek@bzmedia.com
Customer Service/Subscriptions: +1-847-763-9692 stpmag@halldata.com

Cover Photograph by Peter Chadwick

President Ted Bahr Executive Vice President Alan Zeichick

BZ Media LLC 7 High Street, Suite 407 Huntington, NY 11743 +1-631-421-4158 fax +1-631-421-4130 www.bzmedia.com info@bzmedia.com

Software Test & Performance (ISSN #1548-3460, USPS #78) is published 12 times a year by BZ Media LLC, 7 High Street, Suite 407, Huntington, NY 11743. Periodicals privileges pending at Huntington, NY and additional offices. POSTMASTER: Send address changes to BZ Media, 7 High Street, Suite 407, Huntington, NY 11743. Ride along is included. ©2006 BZ Media LLC. All rights reserved. Software Test & Performance is a registered trademark of BZ Media LLC.


Omnia Mutantur
Lindsey Vereen, Editor

Once upon a time, way back in the last century, a year and a day were thought to be the same length on the planet Mercury: 88 Earth days. Thanks to improved measurement techniques, we now know that while the planet does still revolve around the sun every 88 days, it rotates on its axis once every 59 days.

Back when a day and a year on Mercury were the same length, it was one of nine planets in our solar system. But it seems Pluto has now been drummed out of the planetary corps, thanks to its insufficient sphericity and a failure to clear its neighborhood, making Mercury a sibling in a family of only eight planets. Once Mercury was the second smallest planet next to Pluto, but with the disqualification of Pluto, it has gained the distinction of being the smallest.

The decision to blackball Pluto is not immutable. A number of scientists are up in arms about it, and perhaps the decision will one day be rescinded.

Ipso Facto
As anyone who has ever tried to test the continually moving target known as enterprise software understands, an event like the Plutonic expulsion simply highlights the point that everything that is not true a priori is subject to change. And astronomy is a field in which such changes are commonplace. Jupiter, for example, has way more moons now than it did when I was a lad.

Speaking of changes, some are brewing right here at Software Test & Performance. After two years at the helm of this magazine, I have decided to move on. Next month you'll see a new face at the top of this page—the face belonging to Edward J. Correia. You may know that name from SD Times. My first encounter with Eddie took place several years ago at an industry dinner in San Francisco when I was working for another publishing company. I remember being impressed with his insight and industry knowledge. Later, knowing his name, I read his articles in SD Times and grew further impressed. When I joined BZ Media, the company that publishes both SD Times and Software Test & Performance, I was happy for the opportunity to become better acquainted with him.

Curriculum Vitae
Eddie came to SD Times in 2000 and helped with its launch in February of that year. He covered several beats, including integration and SOA, databases, XML and Web services, and embedded and wireless development.

He'd previously worked as a freelancer with several publications, including Network Computing and Unix Today. Then he led the development team that built a publishing company's first centralized editorial repository and its wire-service storage and distribution system.

Later, Eddie took over as the first technical editor of CRN's Test Center, which was just being built. He spent four years with CRN building the Test Center team, developing performance tests for computer software and hardware, conducting the comparative tests and reporting the results.

Post Hoc
And so I leave Software Test & Performance under Eddie's extremely capable guidance, off to relearn everything I ever thought I knew about the solar system. ý



Out of the Box
Compiled by Alex Handy

TestNG 5.0 Smoothes Out Annotations

In late July, TestNG received a major update that added support for organizing and categorizing thousands of unit tests within an application. While JUnit is the most popular unit testing tool for Java, TestNG has been steadily receiving praise for the past three years, thanks to its ability to manage and group large numbers of tests within an application. It's the grouping of tests that sets TestNG apart from JUnit, and it's also the feature that's most markedly improved in version 5.0 of the tool.

Cédric Beust, a software engineer at Google, began developing TestNG three years ago when he decided that JUnit was too limited in scope for proper enterprise use. "The major feature [in this release] is not brand new; it's about more renaming and cleanup of annotation names to make them more intuitive. Our reports are easier to read and better organized now. As we have more and more users who have thousands of tests and dozens of groups, it becomes really important to make those reports easy to read," said Beust, who is originally from France.

Most of the work on the project has been done exclusively by Beust and Alexandru Popescu, who began contributing code soon after Beust created TestNG. Other developers have built plugins for the tool, but Beust and Popescu have remained the primary contributors.

For the future, Beust hopes to see more external contributions, but doesn't expect to do much work on the core of TestNG. "For the past year, there were less features requested in the core and we were working on productivity around the core, which is good because it means the core is working and functional enough," said Beust of the 600-strong mailing list for the tool. "I think we're going to see more stronger integration with Web servers so we can drive TestNG from remote machines. There's also a work in progress to have a distributed version so we can have distributed tests. The general message is going to be more about scaling. For people writing thousands of tests, we want to make it almost transparent for them to use as many machines and as much power as they have."

TestNG is a free tool, and can be downloaded at www.testng.com.
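The grouping feature described above is driven by TestNG's @Test annotation (for example, @Test(groups = {"functional"})) and selected at run time through a testng.xml suite file. The fragment below is a minimal sketch of such a suite; the group names and the com.example.CheckoutTest class are hypothetical, not taken from the article.

```xml
<!DOCTYPE suite SYSTEM "http://testng.org/testng-1.0.dtd">
<suite name="NightlySuite">
  <test name="FunctionalTests">
    <groups>
      <run>
        <!-- run everything tagged "functional", skip anything tagged "broken" -->
        <include name="functional"/>
        <exclude name="broken"/>
      </run>
    </groups>
    <classes>
      <class name="com.example.CheckoutTest"/>
    </classes>
  </test>
</suite>
```

Selecting by group rather than by class is what lets teams with thousands of tests carve out meaningful subsets for a given run.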

Java App Monitor API Hits 2.2

JAMon is a simple and free API that gives developers a window into the performance of their production applications. It can be used during testing or deployment to track statistics and information related to an application's behavior and speed. New to this version is JDBC support and a handful of bug fixes.

The JAMon API is quick and simple to install. Simply place the JAMon.jar file in your class path, then insert the API's start and stop commands around the code you wish to monitor. The tool can be called multiple times in a single application, and requires no administration privileges to run, nor to view the data collected. JAMon is available under a BSD license and can be downloaded at freshmeat.net/projects/jamonapi.
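The start/stop usage the brief describes is a simple wrap-the-code pattern. The class below is a runnable stand-in that mimics that pattern for illustration only; it is not the JAMon API itself, and the "pageLoad" label is invented.

```java
import java.util.HashMap;
import java.util.Map;

// Minimal stand-in for a JAMon-style start/stop monitor: start() captures a
// timestamp, stop() accumulates elapsed time and a hit count per label.
public class MiniMonitor {
    private static final Map<String, Long> totalNanos = new HashMap<>();
    private static final Map<String, Integer> hitCounts = new HashMap<>();

    private final String label;
    private final long startNanos;

    private MiniMonitor(String label) {
        this.label = label;
        this.startNanos = System.nanoTime();
    }

    public static MiniMonitor start(String label) {
        return new MiniMonitor(label);
    }

    public void stop() {
        long elapsed = System.nanoTime() - startNanos;
        totalNanos.merge(label, elapsed, Long::sum);
        hitCounts.merge(label, 1, Integer::sum);
    }

    public static int hits(String label) {
        return hitCounts.getOrDefault(label, 0);
    }

    public static void main(String[] args) throws InterruptedException {
        MiniMonitor mon = MiniMonitor.start("pageLoad"); // wrap the code to be timed
        Thread.sleep(20);                                // stand-in for real work
        mon.stop();
        System.out.println("pageLoad hits: " + hits("pageLoad"));
    }
}
```

Because the monitor is created and stopped around an arbitrary block, the same pattern can be applied at as many points in an application as needed.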



GUIdancer 1.1 Taps Into Town

The newest version of GUIdancer adds Ant task support and a command-line client for added batch capabilities. This Swing and Java GUI testing program is available now from Bredex GmbH at www.guidancer.com.

"The past few months have seen considerable developments for GUIdancer," said Hans-J. Brede, managing director of Bredex GmbH. "As well as improving the general usability of the tool, we've been working on offering more options for testers: batch testing, a larger choice of actions, and translation of test data, for example. This preview is a taste of the full 1.1 version, for which yet more features are in development."

GUIdancer automatically tiptoes through your Swing apps to find glitches and inconsistencies.

GUIdancer can be downloaded and purchased online. A single-user license costs around $1,200, while a multi-user license costs around $5,000.


Selenium Still Growing

The open-source testing framework known as Selenium Core has reached version 0.7.1 as of this writing. Selenium Core is a test tool for Web applications. Selenium Core tests run directly in a browser: Internet Explorer, Mozilla and Firefox on Windows, Linux or Macintosh. Written in pure JavaScript/DHTML, Selenium Core tests are copied directly into your application Web server, allowing them to run in any supported browser on the client side. Selenium Core is open-source software and can be downloaded and used without charge. The tool is designed to make regression and acceptance testing of Web sites easier.

The project has sparked a number of offshoots, including the Selenium IDE, a Firefox-based tool that turns a browser into an automated acceptance test machine. Selenium Remote Control and Selenium on Rails have also made significant progress this fall.

Selenium IDE allows users to record GUI tasks for later playback as tests.

All Selenium tests are written in a proprietary scripting language known as Selenese. But beyond that, the Selenium IDE offers a far simpler way to write tests. Users of the Selenium IDE can record and play back mouse- and keyboard-driven tests within Firefox, then automate them for playback. Tests can then be exported as Ruby scripts, and then automated and tied into a nightly barrage.

Blogger Scott McPhee wrote, of Selenium, "What's cool about it is that it understands the dreaded JavaScript. It actually drives your local browser via a very clever set of JavaScript functions... It's pretty raw at the moment, only young, but I think in a short while it will show some definite promise."

While none of these projects have yet reached general release, they're no less useful at present. All of these Selenium projects can be found at www.openqa.org/selenium. The tools are free and open source.

Erratum: Corrected table from Yuri Chernak's "Bringing Logic Into Play" (Aug. 2006).

TABLE 2: DERIVING A FEATURE PASS CONCLUSION

Disjunctive Syllogism Form / Testing Argument Form:
1. Either P or Q is true – means: After all test cases have been executed, a feature status can be either fail (P) or pass (Q) (implication).
2. P is not true – means: We know that the feature did not fail the test for all of its test cases (evidence).
3. Then Q is true – means: Then the feature passes the test (conclusion).
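The syllogism in the corrected table can be expressed directly in code: model the feature's status as the exclusive alternatives fail (P) and pass (Q), supply the evidence that no test case failed, and the pass conclusion follows. A minimal illustration:

```java
public class FeaturePassLogic {
    // Disjunctive syllogism: either fail (P) or pass (Q); P is false; therefore Q.
    static boolean featurePasses(boolean anyTestCaseFailed) {
        boolean p = anyTestCaseFailed; // P: the feature failed at least one test case
        boolean q = !p;                // the status is exactly one of P or Q
        return q;                      // with P ruled out, Q must hold
    }

    public static void main(String[] args) {
        System.out.println(featurePasses(false)); // prints "true": the feature passes
        System.out.println(featurePasses(true));  // prints "false": the evidence contradicts a pass
    }
}
```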

Watir Flows Over Web Apps

This fall, version 1.5 of the Web Application Testing in Ruby project will be released. Watir, a Ruby library that works with Internet Explorer on Windows, is a free, open-source functional testing tool for automating browser-based tests of Web applications. Ruby provides the ability to connect to databases, read data files, export XML and structure code into reusable libraries. Watir is designed to stomp all over Web applications, chasing links and validating forms.

On the Watir wiki, one user codified his reasons for using Watir. "SilkTest may be good for casual automation. However, I'm talking about automating hundreds and hundreds of highly complex test scenarios on a large Web application. The automated test cases will be maintained by different people, be run for multiple releases over many years. I just cannot imagine using SilkTest without incurring huge cost...

"Using Watir + IE Developer Toolbar + SpySmith, writing automation is very enjoyable and efficient. We are able to write very robust and concise test cases, such as automating drag-and-drop without knowing how such behavior is implemented. We can strictly follow DRY principle and share our assets. We can even embed powerful debugging tools such as Ruby's breakpoint without asking testers to write one extra line."

The user, who posted under the name Dannyy, used SilkTest 8.0 for his comparison.

Watir is free, available at openqa.org/watir. As of this writing, version 1.5 was in full development and nearing release. By the time you read this, it should be available. To use Watir, Windows users must first install Ruby.

Send product announcements to stpnews@bzmedia.com



Peak Performance

Performance Testing Plus: Do the Math!
By Scott Barber

If you're like me, you barely squeaked by in whatever math class you took last. If you're like two of the best programmers I've ever worked with, you failed your last math class miserably, dropped out of school and got a job writing code. Or maybe you enjoy math and statistics, in which case I'm happy for you and encourage you to put that enjoyment to practical use when designing and reporting on software tests.

Whatever your particular situation is, I'm starting to feel like a math and statistics teacher. As a whole, it seems to me that members of software development teams, developers, testers, administrators and managers alike have an insufficient grasp on how to apply mathematics or interpret statistical data on the job.

As an example, I just finished another several-hour discussion with someone claiming to understand statistical principles who believed that a data set including five response-time measurements and a standard deviation roughly equal to the mean was statistically significant. The discussion reminded me that as performance testers, we not only must know and be able to apply certain mathematical and statistical concepts, we must also be able to teach them. Worse, we often have to teach these concepts to people who like math even less than we do. Over the years I've stumbled upon some relatively effective explanations for the mathematical and statistical principles I most often use as a performance tester. I'd like to share them with you.

Averages
Also known as the arithmetic mean, or mean for short, the average is probably the most commonly used and most commonly misunderstood statistic of them all. Just add up all the numbers and divide by how many numbers you just added—what could be simpler? What most folks don't realize is that if the average of 100 measurements is 4, that could imply one quarter of those measurements are 3, half are 4 and another quarter are 5 (we'll call this data set A)—or it could mean that 80 of those measurements are 1 and the rest are 16 (data set B). If we're talking about response times, those two sets of data have extremely different meanings. Given these two data sets and a response time goal of 5 seconds for all users, looking at only the average, both seem to meet the goal. Looking at the data, however, shows us that data set B not only doesn't meet the goal, it also probably demonstrates some kind of performance anomaly. Use caution when using averages to discuss response times, and, if at all possible, avoid using averages as your only reported statistic.

Percentiles
Not everyone is familiar with what percentiles represent. It's a straightforward concept easier to demonstrate than define, so I'll explain here using the 95th percentile as an example. If you have 100 measurements ordered from greatest to least, and you count down the five largest measurements, the next largest measurement represents the 95th percentile of those measurements. For the purposes of response times, this statistic is read "Ninety-five percent of the simulated users experienced a response time of this value or less under the same conditions
Use caution when using averages ples who believed that a data set including to discuss response times, and, if at all posfive response-time measurements and a sible, avoid using averages as your only standard deviation roughly equal to the reported statistic. mean was statistically significant. The disPercentiles cussion reminded me that as performance Not everyone is familiar with what pertesters, we not only must know and be able centiles represent. It’s a straightforward to apply certain mathematical and statisticoncept easier to demonstrate than cal concepts, we must also be able to teach define, so I’ll explain here using the 95th them. Worse, we often have to teach these percentile as an example. If you have 100 concepts to people who like math even less measurements ordered from greatest to than we do. Over the years I’ve stumbled least, and you count down the five largest upon some relatively effective explanations measurements, the next largest measfor the mathematical and statistical prinurement represents the 95th percentile ciples I most often use as a performance of those measurements. For the purpostester. I’d like to share them with you. es of response times, this statistic is read “Ninety-five percent of the simulated Averages users experienced a response time of this Also known as arithmetic mean, or mean for value or less under the same conditions short, the average is probably the most com-



as the test execution.” The 95th percentile of data set B above is 16 seconds. Obviously this does not give the impression of achieving our five-second response-time goal. Interestingly, this can be misleading as well: If we were to look at the 80th percentile on the same data set, it would be one second. Despite this possibility, percentiles remain the statistic that I find to be the most effective most often. That said, percentile statistics can stand alone only when used to represent data that’s uniformly or normally distributed and has an acceptable number of outliers.
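The arithmetic behind data sets A and B, and the counting definition of a percentile given above, can be checked in a few lines of Java (a quick sketch, not any particular load-testing tool's implementation):

```java
import java.util.Arrays;

// Data sets A and B from the column: both average 4, yet only one of them
// would meet a five-second response-time goal.
public class ResponseTimeStats {

    static double mean(int[] data) {
        long sum = 0;
        for (int d : data) sum += d;
        return (double) sum / data.length;
    }

    // The column's counting definition: order the measurements from greatest to
    // least, count down the largest (100 - pct) percent, and take the next one.
    static int percentile(int[] data, int pct) {
        int[] sorted = data.clone();
        Arrays.sort(sorted);                        // ascending order
        int skip = data.length * (100 - pct) / 100; // how many of the largest to count down
        return sorted[sorted.length - 1 - skip];
    }

    static int[] fill(int count1, int v1, int count2, int v2) {
        int[] out = new int[count1 + count2];
        Arrays.fill(out, 0, count1, v1);
        Arrays.fill(out, count1, out.length, v2);
        return out;
    }

    public static void main(String[] args) {
        int[] a = new int[100];            // 25 threes, 50 fours, 25 fives
        Arrays.fill(a, 0, 25, 3);
        Arrays.fill(a, 25, 75, 4);
        Arrays.fill(a, 75, 100, 5);
        int[] b = fill(80, 1, 20, 16);     // 80 ones, 20 sixteens

        System.out.println("mean A = " + mean(a) + ", mean B = " + mean(b)); // both 4.0
        System.out.println("95th percentile of B = " + percentile(b, 95));   // 16
        System.out.println("80th percentile of B = " + percentile(b, 80));   // 1
    }
}
```

Both data sets report an average of 4.0, while the 95th percentile immediately exposes data set B's 16-second responses, and the 80th percentile just as quickly hides them again.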

Uniform Distributions Uniform distribution is a term that represents a collection of data roughly equivalent to a set of random numbers that are evenly distributed between the upper and lower bounds of the data set. The key is that every number in the data set is represented approximately the same number of times. Uniform distributions are frequently used when modeling user delays, but aren’t particularly common results in actual response-time data. I’d go so far as to say that uniformly distributed results in response-time data are a pretty good indicator that someone should probably double-check the test or take a hard look at the application.

Normal Distributions Also called a bell curve, a data set whose member data are weighted toward the center (or median value) is a normal distribution. When graphed, the shape of the “bell” of normally distributed data can vary from tall and narrow to short and squat, depending on the standard deviation of the data set; the smaller the standard deviation, the taller and more narrow the bell. Quantifiable human activities often result in normally distributed data. Normally distributed data is also common for response time data.

Scott Barber is the CTO at PerfTestPlus. His specialty is context-driven performance testing and analysis for distributed multiuser systems. Contact him at sbarber@perftestplus.com.

Standard Deviations
By definition, one standard deviation is the amount of variance within a set of


measurements that encompasses approximately the middle 68 percent of all measurements in the set; what that means in English is that knowing the standard deviation of your data set tells you how densely the data points are clustered around the mean. Simply put, the smaller the standard deviation, the more consistent the data. To illustrate, the standard deviation of data set A is approximately .7, while the standard deviation of data set B is approximately 6. Another rule of thumb is this: Data with a standard deviation greater than half of its mean should be treated as suspect.
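The approximate values quoted for data sets A and B (.7 and 6), and the suspect-data rule of thumb, can be verified with a short calculation (a sketch using the population standard deviation):

```java
// Population standard deviation of the column's two data sets, plus the
// rule of thumb: a standard deviation above half the mean is suspect.
public class StdDevCheck {

    static double mean(int[] data) {
        long sum = 0;
        for (int d : data) sum += d;
        return (double) sum / data.length;
    }

    static double stdDev(int[] data) {
        double m = mean(data), sumSq = 0;
        for (int d : data) sumSq += (d - m) * (d - m);
        return Math.sqrt(sumSq / data.length);
    }

    static boolean suspect(int[] data) {
        return stdDev(data) > mean(data) / 2;
    }

    public static void main(String[] args) {
        int[] a = new int[100], b = new int[100];
        java.util.Arrays.fill(a, 0, 25, 3);   // data set A: 25 threes, 50 fours, 25 fives
        java.util.Arrays.fill(a, 25, 75, 4);
        java.util.Arrays.fill(a, 75, 100, 5);
        java.util.Arrays.fill(b, 0, 80, 1);   // data set B: 80 ones, 20 sixteens
        java.util.Arrays.fill(b, 80, 100, 16);

        System.out.printf("std dev A = %.3f (suspect: %b)%n", stdDev(a), suspect(a)); // 0.707, false
        System.out.printf("std dev B = %.3f (suspect: %b)%n", stdDev(b), suspect(b)); // 6.000, true
    }
}
```

Data set B's standard deviation of 6 is larger than its mean of 4, well past the half-the-mean threshold, which is exactly the kind of data the rule of thumb tells you to distrust.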

Statistical Significance
Mathematically calculating statistical significance, also known as reliability, based on sample size is not only beyond the scope of this column, it's just plain complicated. Luckily, you can usually get away with skipping the math by applying some common sense. Since it's typically fairly easy to add iterations to your tests to increase the total number of measurements collected, the best way to ensure statistical significance is simply to collect additional data if you have any doubt about whether or not the collected data represents reality. Whenever possible, ensure that you collect at least 100 measurements from at least two independent tests.

In support of the commonsense approach described below, check out this excerpt from a discussion on the topic from StatSoft (www.statsoftinc.com), a company that provides analytic software:

There is no way to avoid arbitrariness in the final decision as to what level of significance will be treated as really 'significant.' That is, the selection of some level of significance, up to which the results will be rejected as invalid, is arbitrary. In practice, the final decision usually depends on whether the outcome was predicted a priori or only found post hoc in the course of many analyses and comparisons performed on the data set, on the total amount of consistent supportive evidence in the entire data set, and on 'traditions' existing in the particular area of research... But remember that those classifications represent nothing else but arbitrary conventions that are only informally based on general research experience.

While there's no hard-and-fast rule about how to decide which results are statistically similar without complex equations that call for volumes of data, try comparing results from at least five test executions and apply these rules to help you determine whether or not test results are similar enough to be considered reliable if you're not sure after your first two tests:

1. If more than 20 percent (or one of five) of the test execution results appear not to be similar to the rest, something is generally wrong with either the test environment, the application or the test itself.

2. If a 95th percentile value for any test execution is greater than the maximum or less than the minimum value for any of the other test executions, it's probably not statistically similar.

3. If a measurement from a test is noticeably higher or lower, when charted side-by-side, than the results of the rest of the test executions, it's probably not statistically similar.

4. If a single measurement category (for example, the response time for a specific object) in a test is noticeably higher or lower, when charted side-by-side with all the rest of the test execution results, but the results for all the rest of the measurements in that test are not, the test itself is probably statistically similar.
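Rule 2 above lends itself to a direct check. The sketch below (with invented numbers, not figures from the column) flags a test execution whose 95th-percentile value falls outside the minimum-to-maximum range of the other executions:

```java
// Rule 2: a test execution whose 95th-percentile response time is above the
// maximum (or below the minimum) value seen in the other executions is
// probably not statistically similar to them.
public class SimilarityCheck {

    static boolean rule2Violated(int p95, int othersMin, int othersMax) {
        return p95 > othersMax || p95 < othersMin;
    }

    public static void main(String[] args) {
        // Hypothetical data: four runs ranged from 2 to 9 seconds overall,
        // and a fifth run's 95th percentile came in at 14 seconds.
        System.out.println(rule2Violated(14, 2, 9)); // prints "true": flag the fifth run
        System.out.println(rule2Violated(5, 2, 9));  // prints "false": within range
    }
}
```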

Statistical Outliers
If we were to ask statisticians what an outlier is, they would tell us that it's any measurement that falls outside of three standard deviations, or 99 percent, of all collected measurements. The problem with this definition in our case is that it assumes that our collected measurements are statistically significant and are distributed normally—which is not nearly as common as we'd like for response times. A more applicable definition of an outlier can be found in StatSoft's glossary:

Outliers are atypical (by definition), infrequent observations; data points which do not appear to follow the characteristic distribution of the rest of the data. These may reflect genuine properties of the underlying phenomenon (variable), or be due to measurement errors or other anomalies which should not be modeled.

Based on this definition, I recommend that if you see evidence of outliers—occasional data points that just don't seem to belong—reexecute the tests and compare them to your first set. If the majority of the measurements are the same, plus or minus the potential outliers, the results are likely to contain genuine outliers that can be disregarded, but if the results show similar potential outliers, these are probably valid measurements that deserve consideration.

The next question is, "How many outliers can we dismiss as 'atypical infrequent observations'?" Assuming we've made the determination that we have collected a statistically significant sample of measurements, we can address this question. I submit that there is no set number of outliers that can be unilaterally dismissed, but a maximum percentage of the total observations should do as a rule of thumb. If we apply the spirit of the two definitions that we have discussed, we come to the conclusion that up to 1 percent of the total measurements beyond the third standard deviation are significantly outside the rest of the measurements and can be considered outliers.

I hope you find this useful in educating the folks who view your results as to what those results truly represent, so they can make informed decisions about the application's performance. ý

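The Peak Performance column's outlier rule of thumb (occasional points beyond three standard deviations, dismissible only up to about 1 percent of the total) can be sketched as follows; the 44-second measurement is invented for illustration:

```java
// Flag measurements beyond three standard deviations of the mean, then apply
// the rule of thumb that at most 1 percent of them may be dismissed as outliers.
public class OutlierCheck {

    static double mean(double[] d) {
        double sum = 0;
        for (double x : d) sum += x;
        return sum / d.length;
    }

    static double stdDev(double[] d) {
        double m = mean(d), sumSq = 0;
        for (double x : d) sumSq += (x - m) * (x - m);
        return Math.sqrt(sumSq / d.length);
    }

    static int countBeyondThreeSigma(double[] d) {
        double m = mean(d), limit = 3 * stdDev(d);
        int count = 0;
        for (double x : d) {
            if (Math.abs(x - m) > limit) count++;
        }
        return count;
    }

    public static void main(String[] args) {
        double[] responses = new double[100];
        java.util.Arrays.fill(responses, 4.0); // 99 four-second responses...
        responses[99] = 44.0;                  // ...and one 44-second spike

        int outliers = countBeyondThreeSigma(responses);
        boolean dismissible = outliers <= responses.length / 100; // the 1 percent rule
        System.out.println(outliers + " outlier(s); dismissible: " + dismissible);
    }
}
```

If reexecuting the test reproduces the same spike, the column's advice is to treat it as a valid measurement rather than an outlier, so a check like this is a prompt to rerun, not a license to delete data.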


J2EE Application Issues Are Often Buried Deep In The Code
By Edie Lahav

One of the reasons for the popularity of the Java 2 Platform, Enterprise Edition

(J2EE) is that it decreases the need for complex programming while increasing the reusability of component-based modules. While it’s a boon for developers, this development model places a new emphasis on the IT infrastructure’s middle tier. Working with the different data types, hardware components and J2EE application servers to build an application is a new endeavor for many developers, with a variety of challenges for the test engineers in charge of software quality. This distributed computing environment brings with it a changing group of variables. Unlike legacy architectures, when


Software Test & Performance • OCTOBER 2006

JAVA 2 APPLICATION TESTING

When a J2EE application goes into production, the distributed environment often behaves dynamically. A range of performance, application and configuration issues can occur that are unique to the dynamic J2EE environment. For the test engineer, discovering how the software components will behave during the application’s life cycle is key to successful QA. Performance problems are often discovered when testers are able to simulate application traffic—so load and stress testing is an important focus area for the test engineer. This article pinpoints some root causes of performance issues in J2EE applications.

Testing applications is a challenge in the J2EE arena for several reasons. First, in many organizations, testing roles overlap among QA, development, production and networking. Diversity and changeability in a J2EE system’s many elements make diagnosing a problem very difficult. Varying loads and requirements on J2EE applications and components can cause runtimes to fluctuate, and application performance problems are often buried deep in the application code.

In addition, many applications haven’t been designed for test. Even in most J2EE Web applications, where performance and memory concerns are the top issues, the “hooks” for testing are typically not well defined in the application development stages. As developers and testers work more collaboratively, this will change: More planning in the development stage will focus on the application’s functional service objectives for performance and scalability.

And last, but not least, when a J2EE application has a problem, it’s generally found at the code level. With typically few clues on the surface, the test engineers spend hours re-creating a situation and generally need a full team (development, networking and so on) standing by to watch test and QA try to pinpoint the error’s root cause.

Compound Communication
In most organizations, the J2EE development team has a set of testing tools that typically include code profiling. This ensures that the code works, but it doesn’t test the application under load. So, after the code is “complete,” it’s turned over to the QA team to determine if it’s ready for production.

In a more effective approach, test and QA engineers are part of the development team, and the testing plan is written when the specifications for the application are completed. In this model, the test and QA team know as much about the intended use of the application as the developers, and are able to build a solid plan as the application is being developed.

However, if you can’t swing a compound team like that, you can still stay in the loop by scheduling individual meetings with the development team and the application’s sponsor and asking them to describe the application for you. Learn about the expected typical user behavior, including log-in, capacity expectations and any known technical details about the application. Obviously, the development team will share more information along the lines of a product specification, and the sponsor will tell you what to expect functionally, but with input from both, you should be able to piece together a thorough understanding of the application.

Edie Lahav is director of R&D at RadView.

Photograph by Achim Holzem

Setting the Scene
So, you’ve written and executed a solid test plan. You’ve tested the application’s capacity, throughput and log-in functions, and you’ve profiled the hardware and software. Basically, you’ve executed the test plan and you’ve encountered what look like performance issues. How do you report them in the most effective way? You’d like to tell the development team more than “the application broke here or crashed there.” You want to explain exactly why the application crashed and what caused it.

As mentioned earlier, to determine the root cause of problems in the distributed J2EE world, you often need to re-create the problem for a team of colleagues. It’s time-consuming to get the DBA, the network administrator and a development engineer to watch different pieces of the system while QA runs the load test to re-create the problem—and that’s just the first step. After that, this team must sift through the data generated at all these different system points to hunt down that broken moment and determine why the problem occurred.

Logging Error Locations
What you need is a log of your test executions that captures a complete record of application performance across the multitier environment in one synchronized timeline. This will provide you with data points at each location where the application may fail, with the details that the developers need to fix the problem. After QA gathers this data, you must be able to analyze it. Looking through the data to locate the bottlenecks takes some J2EE background knowledge, so let’s get down to some basics.

Seeing Inside J2EE
Some of the very benefits that J2EE offers developers lie in the features that make the language a challenge for the test and QA engineer. For example, the ability to use object-oriented programming and encapsulate programs can keep the root cause of a problem buried within a component. For many test engineers, J2EE components limit testing to black-box technology because they can’t “see” inside. Test engineers need tools that provide insight down to the component level to pinpoint problems.

In addition, the environment poses testing concerns. In the evaluation of any J2EE problem, your first step is to determine if the error is caused by an application component or a system issue. Test and QA engineers trained in viewing load test metrics against server farms are challenged by the typical J2EE environment of firewalls, Web servers, EJB application servers and database servers. When an application fails, as the test engineer, you must trace that failure through the entire system to the root cause. Along the way, you may encounter some of the following system issues:
• Inadequate server resources (JDBC, threads, memory)
• Thread pool sizes
• Inappropriate server parameters (class path, JVM)
• Custom architectural wiring
• Low memory
• Configuration problems

First, you must understand the application’s infrastructure and how the application components act under load. Once you’ve determined that the problem is not an infrastructure issue (all transactions between components and the infrastructure work), it’s time to drill down to the components to look for the root cause. You’ll need a tool that will enable you to drill down to the component level.

TABLE 1: APPLICATION SERVERS AND PACKAGES COMPARED

Application Server Properties     Application Package Properties
Application Server Version        Deployment Date of WAR & EAR files
JVM Version                       File Size
JDK Type and Version              EJB Type
JTA Timeout                       Classes Location
Maximum Heap Size                 Implementation Class
Server’s Classpath                Component Load Order
JDBC Driver
JDBC Pool Initial Capacity
OS Version
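Some testing tools automate exactly this kind of comparison. As a hedged sketch of the idea only (the property names below are invented, not taken from any particular application server), the Table 1 checklist reduces to diffing two sets of key/value pairs and reporting every setting on which the servers disagree:

```java
import java.util.Properties;
import java.util.TreeSet;

// Illustrative sketch: compare two servers' configurations and collect
// every property key whose value is missing or different.
public class ConfigDiff {
    public static TreeSet<String> mismatches(Properties a, Properties b) {
        TreeSet<String> diff = new TreeSet<>();
        TreeSet<String> keys = new TreeSet<>(a.stringPropertyNames());
        keys.addAll(b.stringPropertyNames());
        for (String k : keys) {
            String va = a.getProperty(k), vb = b.getProperty(k);
            if (va == null || !va.equals(vb)) diff.add(k); // missing or different
        }
        return diff;
    }

    public static void main(String[] args) {
        Properties test = new Properties(), prod = new Properties();
        test.setProperty("jvm.version", "1.4.2");
        prod.setProperty("jvm.version", "1.4.2");
        test.setProperty("max.heap", "512m");  // hypothetical mismatch
        prod.setProperty("max.heap", "256m");
        System.out.println(mismatches(test, prod)); // [max.heap]
    }
}
```

Running the same diff before each test cycle catches the server-misconfiguration class of problems before any load-test time is spent on them.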

Capturing Performance Data
There are several measurement interfaces for capturing J2EE application performance data. You can use the interface provided by the application server vendor for high-level monitoring, or you can sample a portion of events from an event stream. Sampling, unless implemented as a dynamic practice, often misses specific or sequence-related problems, so make sure that your testing tool allows you to change your sampling profile (what you’re recording—the type and depth of information) on the fly. This allows you to switch back and forth between viewing high- and low-level statistics during the test, because when you begin sampling, you don’t know when or where you’ll need to dive deeper.

A technique called total trace allows you to capture granular data for each event (including arguments) executed within the code. When included in testing tools, this technology often has embedded intelligence that automatically manages the recording overhead and returns a data log that you can then analyze to pinpoint the problem. The next step for the QA and test team is to replay and analyze the data to quickly pinpoint the root cause of the application problem. Look for tools that provide an interactive log that allows the team to follow a transaction from the client to the Web server to the distributed environment and drill down into the log to rapidly locate the issue.
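Commercial total-trace tools do this with bytecode instrumentation, but the core idea fits in a few lines. The following is an illustrative sketch only (the Inventory interface and the log format are invented): a JDK dynamic proxy records, for every call, the same kind of per-event data a trace log holds—method name, arguments and duration.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical miniature of "total trace": wrap a component behind a proxy
// that appends one timeline entry per event, arguments included.
public class TraceSketch {
    public interface Inventory { int reserve(String sku, int qty); }

    public static final List<String> LOG = new ArrayList<>();

    @SuppressWarnings("unchecked")
    public static <T> T traced(T target, Class<T> iface) {
        InvocationHandler h = (proxy, method, args) -> {
            long start = System.nanoTime();
            Object result = method.invoke(target, args);   // forward the call
            long micros = (System.nanoTime() - start) / 1_000;
            LOG.add(method.getName() + Arrays.toString(args) + " took " + micros + "us");
            return result;
        };
        return (T) Proxy.newProxyInstance(iface.getClassLoader(),
                new Class<?>[]{iface}, h);
    }

    public static void main(String[] args) {
        Inventory real = (sku, qty) -> qty;            // trivial stand-in component
        Inventory traced = traced(real, Inventory.class);
        traced.reserve("A-100", 3);
        System.out.println(LOG.get(0));                // e.g. reserve[A-100, 3] took 12us
    }
}
```

A real tool records this for every component in the call chain and keys the entries to one synchronized clock, which is what makes the replay-and-drill-down analysis described above possible.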

The Usual J2EE Offenders
In testing, it helps to know what to look for in a J2EE application. Let’s examine some of the more common J2EE performance issues and their probable causes. Performance problems can be caused by any number of sources, from incorrect business logic to incompatible components to an unexpected application error, such as an exception that isn’t written to a log file.

In your problem-solving quest, begin by reviewing the execution log and look for obvious synchronization, deadlock or crash situations. When you come to a failure in the application, drill down to the component level to view what components are being accessed, or look for inefficient database calls. Other issues that can cause performance problems include:
• Improper application settings. These can include an inadequate session-state provider or buffering disabled on a Web Forms page.
• Improper application server settings—for example, an insufficient number of threads allocated to an application. This can be diagnosed by recording code executed in synchronization with application and application server metrics, including EJBs. Then review the log at the system-level view to pinpoint the problem.
• Interoperability with legacy code.
• Infrastructure problems. For example, poor network responses that cause degradation in application performance, or insufficient hardware: limited processing power, lack of memory and so on.
• Unexpected application errors. These can be found by reviewing the history of the code execution and object states at each method call. The component-level view provides visibility into the initial and subsequent failures of the application.
• Methods that perform poorly over many transactions. These can be identified by reviewing metrics for EJBs, servlets, JSPs and so on, through a component-level view to find the execution path that leads to the cause of the poor performance.
• Memory thrashing and memory leaks.
• Consistently and/or intermittently slow methods affecting specific user or data values.

Many J2EE applications use third-party Web services. These components bring with them an entire range of issues, including inefficient operations, memory consumption, and insufficient application and framework settings. To diagnose this problem, look for inefficient operations such as the application opening a connection to the database upon every request, or poorly performing SQL statements or stored procedures. Then, pinpoint the error by recording the complete test execution at the system level to provide information on the number of method calls and their duration in any window of time.

Memory consumption can also point to a performance problem. Identify a possible error by recording performance counters that can indicate that the cause is a memory issue, such as free memory and so on. Having too many instantiations of a single object running at the same time also raises a red flag: Either the code is creating too many copies of the object, or the instantiations of the object aren’t being destroyed after use.

Application and framework settings—for example, when the maximum number of threads in the thread pool is insufficient—can also cause performance problems. Pinpoint them by recording all the necessary performance counters and messages written to the event log, as well as application and framework settings (J2EE configuration accesses and so on). This data will reveal non-optimized settings that caused the application to slow down.

Long wait times or poor DB performance can be caused by non-optimized SQL statements or stored procedures. Pinpoint this problem by recording the JDBC performance and drilling down to the actual SQL statement that was executed against the DB.
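The “too many live instantiations” red flag can be reproduced in a few lines. This is a deliberately contrived sketch (the registry is invented for illustration): objects are added to a static collection on creation but never released, so the instance count climbs with every request instead of holding steady.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical leak: every instance registers itself, and nothing prunes
// the registry, so "live instances" only ever grows.
public class LeakSketch {
    static final List<LeakSketch> REGISTRY = new ArrayList<>(); // never pruned: the leak

    LeakSketch() { REGISTRY.add(this); }

    static int liveInstances() { return REGISTRY.size(); }

    // The fix: release the reference when the object's work is done.
    void close() { REGISTRY.remove(this); }

    public static void main(String[] args) {
        for (int i = 0; i < 1000; i++) new LeakSketch();  // leaked
        System.out.println(liveInstances());              // 1000, not ~0
        new LeakSketch().close();                         // properly released
        System.out.println(liveInstances());              // still 1000
    }
}
```

In a load test, this is exactly the signature to watch for: an object count that tracks cumulative traffic rather than concurrent users.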

J2EE Configuration Criminals
To be able to properly test J2EE applications, the test and QA team needs background on the J2EE configuration infrastructure. Ask the development team for a detailed explanation of the application’s infrastructure. This will give you the information you need to identify issues such as insufficient permissions to access a resource, incorrect application settings in a J2EE configuration file or conflicts with other applications. Due to its highly distributed nature, a J2EE application has many opportunities to get stuck in the infrastructure. Let’s go through some of the more common configuration issues and ways to diagnose them.
• Insufficient permissions to access a resource. For example, does the application have permission to write to an application directory? Solve this by reviewing all successful and attempted accesses of the application by any resource on the computer.
• Incompatible components. Locate this issue by recording the interactions of the application with any homegrown and third-party components to identify unsuccessful calls.
• Conflicts with other applications and/or Web services. To catch this problem, record all the interactions of the user’s application with any external resources or services. The recording should show any unsuccessful Web services calls or failures in accessing a database.
• New software that causes an application failure. Often, a new deployment of software can lead to an application’s failure. Review the historical log to determine if this is the case.

Many times, poor performance is due to configuration errors. Before testing occurs on a J2EE application, or when a new application server is rolled into production, it’s imperative that the configuration of the server be diagnosed. Reviewing the configuration before testing can resolve issues before a great deal of time and energy is spent trying to find a problem that was ultimately caused by a server misconfiguration. Some testing tools allow you to compare application servers to determine configuration errors. With this capability, you’ll want to compare the application server and application file properties shown in Table 1.

Functional Flaws

Functional application problems are typically found earlier in the test cycle, but sometimes pop up later during load testing. Here are three common functional problems and some likely causes: • Incorrect business logic. Solve by recording the complete flow of managed and unmanaged components and public method calls. Captured exceptions may indicate a code problem. • Hang/Time-out. Solve by recording the execution of multiple threads and processes at the system level. • Crash. Find crashes by recording application execution at the system level and automatically capturing the crash events.
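For the hang/time-out case above, the JVM itself can confirm a monitor deadlock without replaying a full load test. The following sketch is contrived for illustration: two daemon threads take the same two locks in opposite order (a latch guarantees they collide), and the platform ThreadMXBean is then asked to report the deadlocked threads.

```java
import java.lang.management.ManagementFactory;
import java.util.concurrent.CountDownLatch;

// Illustrative sketch: provoke a classic lock-ordering deadlock, then
// detect it via the JVM's built-in thread management bean.
public class DeadlockSketch {
    public static int provokeAndDetect() throws InterruptedException {
        final Object a = new Object(), b = new Object();
        final CountDownLatch bothHoldFirstLock = new CountDownLatch(2);
        Thread t1 = lockInOrder(a, b, bothHoldFirstLock);
        Thread t2 = lockInOrder(b, a, bothHoldFirstLock);
        t1.start(); t2.start();
        bothHoldFirstLock.await();  // both threads now hold their first lock
        Thread.sleep(300);          // let each block on the other's lock
        long[] ids = ManagementFactory.getThreadMXBean().findMonitorDeadlockedThreads();
        return ids == null ? 0 : ids.length;
    }

    static Thread lockInOrder(Object first, Object second, CountDownLatch latch) {
        Thread t = new Thread(() -> {
            synchronized (first) {
                latch.countDown();
                try { latch.await(); } catch (InterruptedException e) { return; }
                synchronized (second) { }   // never acquired: deadlock
            }
        });
        t.setDaemon(true); // allow the JVM to exit despite the stuck threads
        return t;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(provokeAndDetect()); // 2
    }
}
```

In practice, a thread dump or this same MXBean query during a stalled load test is often the fastest way to distinguish a deadlock from a slow external resource.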

The J2EE Challenge
J2EE offers a rich environment for the developer, and companies investing in these applications expect a promised return on investment that only the test and QA team can ensure. J2EE also offers the tester a challenge—to learn the environment, avoid pitfalls and work more closely with the development team. In that way, you can decrease the time needed to deliver top-quality applications.

15


CODE REVIEWS

Shining Light on Java Code
Static Analysis Tools Aid In Delivering On-Time Code Reviews

By Alan Berg

Photograph by Ravi Tahilramani

Static analysis is the process of checking code for patterns that indicate programming faults. Tools such as PMD, QJPro and Findbugs make the automatic delivery of on-time reports with indications of code quality viable. Here, I’ll explore an example of report generation that uses nightly build methodology to deliver on-time agile code reviews. Further, I’ll explain the nature of an automatic code review, with emphasis on the value of the Findbugs tool (http://findbugs.sourceforge.net).

Bug Hunting
Findbugs looks for antipatterns known as bug patterns in compiled Java code. The tool searches for a large number of patterns that are common errors; for example, exceptions that are caught and then ignored, or classes that are serializable but contain parts that really are not. Findbugs works on compiled code using the BCEL library (http://jakarta.apache.org/bcel) and performs its analysis on .class files. Obviously, you need to compile the code before the tool can analyze it. At an estimate, Findbugs searches for a list of 100 antipatterns; this list expands with every new version, and if you’re motivated, you can add your own patterns in what appears to be a relatively easy process. Please remember, as the tool is a community effort, that it’s polite to pass back your improvements.

Why Use Findbugs?
Static analysis comes of age for large code bases, with multiple teams striving to fulfill stringent targets with feature-rich functional requirements. In this extreme and acidic type of environment, the velocity of change and inconsistent code quality across teams make it costly to perform QA by human effort alone. Quality assurance focuses on the most important features of any given product, normally functionally testing a thin path within the whole code base. One issue that is not always addressed is the consistency of bad practices by a given team, or the general level of defects.

Static analysis works from the bottom up and pays attention evenly throughout the whole code base. The tool can be considered a success if the analysis doesn’t generate too many false positives. Under these conditions, the tool acts as a neutral and objective observer with abundant patience, defanging the potential for very real and human friction. When one subproject is failing relative to the quality generated by another, the statistics will point this out quickly and painlessly—Findbugs doesn’t feel pain.

Further, static analysis captures a large swath of real errors. Not all of them are particularly interesting; sometimes the reports sound more like nags. However, the rate of real errors, trivial or not, is around 50-75 percent. This efficiency affords coding teams the time to breathe and then fire off reports and act on them—with a significant boost to quality. In the long run, this will simplify debugging of the more challenging issues as background noise is diminished.

How to Boost Static Cling
Motivating teams to use static analysis can be a chore. Humans (and yes, even developers) tend to be turned off by critical reports, and placing a new infrastructure in an organization can be tiring. Personally, I haven’t yet found an optimal way of achieving this goal, but I suspect the best time to implement is at the start of a project. Throwing in new technology at the most stressful parts of the QA process results only in polite comments and passive resistance. Instead, try to start at a restful time, perhaps as your team is generating functional requirements and prototypes. Remember to consult your target audience, asking questions like “What are the false positives?” “Which tests are the most valued?” and “Where should we be looking first?”

Inside Findbugs
Findbugs has a number of interfaces, including the command line, Ant, Swing and even an Eclipse plug-in. Because Eclipse is a visual environment, the plug-in is probably the handiest way to get a feel for the tool. You have two approaches to installation: download a zip file and expand it into the Eclipse plug-in directory, or go via a live update. To install in Eclipse 3.2, follow the installation instructions at http://findbugs.cs.umd.edu/eclipse.

Once installed, to activate Findbugs, first create a Java project, then right-click on the project and select Properties > Findbugs. The configuration screen shown in Figure 1 pops up. Select “Run Findbugs automatically” and set “Minimum Priority to report” to low. By doing this, you can see more of the rules as they’re automatically applied. I suggest that you browse the bug codes and compare them with the descriptions found at http://findbugs.sourceforge.net/bugDescriptions.html.

FIG. 1: THE FINDBUGS PROPERTIES PANE

Once the settings are in place, your code will be tested automatically and the errors will be reported in the Problems pane, which is conveniently located underneath your source code by default. To gain insight into the error types, try writing deliberately bad code and then test it against the Findbugs set of practices (see Listing 1). To generate the results shown in Figure 2, create the class file under your own project, then right-click the project in the navigation window and choose Find Bugs > Find bugs. Notice the seven warnings and no errors.

FIG. 2: RESULTS FOUND BY THE FINDBUGS ECLIPSE PLUG-IN

Work your way through the discovered bugs from top to bottom; double-clicking an issue listed in the Problems dialog moves the cursor to the relevant line in the source code. Table 1 describes each warning in the relevant source code and gives a broader description of that warning. Again, for your own education, you should make such a table yourself.

A Functional Plug-In
At this point, you should have a functional plug-in. I strongly recommend getting your hands dirty by writing some buggy code and seeing what the tool makes of it. Findbugs generates valuable feedback for new developers, and Eclipse interactions can serve as effective training. For large projects with many different players and teams involved, such an upward push is also helpful. The weakest link is the one that the user tends to remember: If your service is working 99 percent of the time but the last 1 percent is troublesome, that troublesome code will probably diminish your company’s reputation. But if static analysis is enforced as a generic practice evenly applied across the whole of your software cycle, that last 1 percent may not be so problematic. Next, let’s look at a possible approach to integration into the development process.

Nightly Build
Using the Eclipse plug-in and playing with code can provide insight into the quality of the reporting and the value of the different issues measured. However, this approach isn’t scalable to most large preexisting code bases. Findbugs, in combination with a nightly build structure, is better suited to the task. Via the command line, I generate a dynamic HTML report. To achieve this, the source code must have been compiled previously, and so this approach sits snugly in the nightly build infrastructure. The workflow could be similar to the following:
• Build the source code via Ant, Maven or your own specialized tools
• Deploy
• Perform static analysis of code
• Once satisfied, tag code and deploy from development to acceptance environment
• Perform functional tests with tools such as Anteater (http://aft.sourceforge.net)
• Perform static analysis of code via extra tools such as PMD or QJPro
• Perform functional tests
• Stress test if acceptance and production are realistically similar
• Deploy to production

I test regularly in development with one static analysis tool with its own rule set and then try out a couple of others in the acceptance phase, but many testers choose to use all or several of the tools in parallel near the beginning of the development cycle. I choose not to, so as to avoid overwhelming the audience with false positives and other information. However, as your audience becomes more sophisticated, you can use more sophisticated tools as well.

To run the default tests against your compiled package, you’ll need to get to the Findbugs bin directory and execute the following command:

./findbugs -textui -maxHeap 1024 -html:fancy.xsl -outputFile test.html location_of_your_compiled_source_code

-maxHeap is the maximum amount of memory Findbugs may consume. -html:fancy.xsl tells Findbugs to generate an HTML report via an XSL transform from the built-in fancy.xsl style sheet, and test.html is the name of the HTML report. I’ve run this command against a 440,000-line Java code base; the report generation cost two or three hours of computational effort on a dual-core desktop computer. If you apply this effort to the nightly build process, the report will be viewable by the developers when they come in to work the next day. Just add the command to a cron job, making sure that the generated report lies under the document root of a conveniently configured Web server.

Figure 3 is a screen grab of a realistic HTMLized report. The top row offers the option of viewing bugs by category or package, offering a dynamic JavaScript/DOM tree that you may navigate to help locate problems.

FIG. 3: SCREEN GRAB OF A FINDBUGS HTMLIZED REPORT
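Two of the bug patterns mentioned earlier are easy to reproduce: a dropped exception (the DE warning family) and a serializable class holding a non-serializable member (an SE_BAD_FIELD-style warning). The classes below are invented for illustration; Findbugs flags Broken, while Fixed shows the usual remedies.

```java
import java.io.Serializable;
import java.util.logging.Logger;

// Side-by-side sketch of two bug patterns and their fixes.
public class PatternDemo {
    // Serializable class with a non-serializable field, plus a swallowed exception.
    static class Broken implements Serializable {
        Object session = new Object();   // Object is not Serializable
        int parse(String s) {
            try {
                return Integer.parseInt(s);
            } catch (NumberFormatException e) {
                // silently swallowed: the DE pattern
            }
            return -1;
        }
    }

    static class Fixed implements Serializable {
        transient Object session = new Object(); // excluded from serialization
        private static final Logger LOG = Logger.getLogger("demo");
        int parse(String s) {
            try {
                return Integer.parseInt(s);
            } catch (NumberFormatException e) {
                LOG.warning("bad number: " + s); // at least leave a trace
            }
            return -1;
        }
    }

    public static void main(String[] args) {
        System.out.println(new Fixed().parse("42"));   // 42
        System.out.println(new Fixed().parse("oops")); // -1
    }
}
```

Both versions behave identically on the happy path, which is precisely why these defects survive functional testing and need a static analyzer to surface them.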

LISTING 1: A CLASS WITH DUBIOUS PROGRAMMING PRACTICES

import java.io.IOException;

public class FailOne {
    String test;
    Object example;

    public FailOne() {
        this.example = new Object();
    }

    public void setAction(String action) {
        action = "I have changed";
    }

    public void badException() throws IOException {
        System.exit(0);
        System.gc();
        try {
            throw new IOException();
        } catch (Exception e) {
            // Heh, let's ignore bad news
        }
    }

    public Object getExample() {
        return this.example;
    }

    public synchronized void setExample(Object example) {
        this.example = example;
    }
}

TABLE 1: WARNINGS, SOURCE CODE AND EXPLANATION

Error: DE: FailOne.badException() might ignore java.lang.Exception
Source line mentioned: } catch (Exception e) { // Heh, let's ignore bad news }
Description: Ignoring exceptions can stop your code from failing in the short term. However, errors indirectly caused by this may mean that you’re looking in the wrong place in the code whenever an issue arises.

Error: Dm: FailOne.badException() forces garbage collection; extremely dubious except in benchmarking code
Source line mentioned: System.gc();
Description: During garbage collection, everything else in the given JVM stops. Therefore, this action has potentially severe performance implications. Use this command sparingly and singularly.

Error: Dm: FailOne.badException() invokes System.exit(...), which shuts down the entire virtual machine
Source line mentioned: System.exit(0);
Description: Unless you really want to stop your application totally and kill the JVM, you should throw exceptions rather than exiting.

Error: IP: Parameter action to FailOne.setAction(String) is dead upon entry but overwritten
Source line mentioned: public void setAction(String action) { action = "I have changed"; }
Description: The input parameter action should never be changed, due to the risk of side effects. This is usually an accidental and hopefully rare mistake.

Error: IS: Inconsistent synchronization of FailOne.example; locked 50% of time
Source line mentioned: public Object getExample() { return this.example; }
Description: If you synchronize the set method, for the same reasons you should probably synchronize the get method.

Error: UG: FailOne.getExample() is unsynchronized, FailOne.setExample(Object) is synchronized
Source line mentioned: public synchronized void setExample(Object example) { this.example = example; }
Description: If you synchronize the set method, for the same reasons you should probably synchronize the get method.

Error: UuF: Unused field: FailOne.test
Source line mentioned: String test;
Description: The field test is never used. In general, this is due to one of two reasons: the code has still to be written, or it’s obsolete due to the speed of change.

Refactoring in Reviews
Code reviews are excellent tools for spotting smelly code—code that’s rigid and easy to break, or too complex to understand. From my own experience, I notice that defects tend to cluster—hidden in the smoke there tends to be fire. Therefore, even if the reported issues are trivial, the relative number of defects per subproject can suggest the direction of your vectored refactoring efforts. Assuming that you have time to refactor and don’t get stuck in the frantic problem-solving stage, Findbugs can be extremely useful. Here are some rules of thumb that can help with refactoring:

Caught Exceptions: If your code base is more than a bit buggy, log previously ignored caught exceptions. You may find a large amount of information delivered to the log files; therefore, start by trawling the HTML report per package. One motivation for removing ignored exceptions is that the end user may believe that their transactions have succeeded when the failure has merely been ignored. If the code sits in this part of the application, add end-user feedback logic. Finally, the code may fail in one place while the problem originates elsewhere; logging even unsuspected code may bring this type of error to the surface.

Code complexity: Once the length of a class in lines of code is greater than, say, the length of an A4 page, or if statements are nested too deeply, developers will have more problems understanding the code. This challenge translates into higher defect levels and many extra opportunities for failure. Worse still, high code coverage via regression testing such as JUnit tests is much more difficult to achieve in these situations. Therefore, keeping complexity out of the code increases code quality. Complex code probably means that you need to reexamine your design patterns.

Other refactoring tools such as PMD (http://pmd.sourceforge.net) and QJPro (http://qjpro.sourceforge.net) are superior to Findbugs at finding complexity. These tools also work on uncompiled Java code, thus avoiding the sometimes unnecessary step of compilation. By adding a second or third tool, you increase the number of rules checked—but you also boost the background noise of more false positives. Finding the balance can be difficult. Some testers try a varied approach: Apply one tool daily, and then occasionally verify with a second and third tool.

Junk DNA: As code bases grow, functional requirements change. Through this erosion process, parts of your code base may no longer get regularly exercised. Removing this junk DNA helps keep the developer concentrated. PMD has an Ant task that spots duplicated code and outputs the duplicates to a text file for further analysis. You should throw this code away or factor it out into common utility classes.

Performance: Findbugs looks for about 10-20 potential performance issues, such as string concatenation within loops or when to make an inner class static, thus decreasing memory footprints. Performance problems created in the heat of coding are hard to spot in a forest of lines and powerfully provisioned system resources.

Common errors: One of the stated focuses listed on the Findbugs homepage is to eliminate commonly found errors and typos.

Excellent—and Free
I found Findbugs to be an excellent—and free—static analysis tool. It can be used as an Eclipse plug-in for training or from the command line as an enterprisewide tool. Every new version brings with it the ability to discover new issues. All in all, the tool is an honest broker that helps developers keep their code to a consistent minimum standard.

Whether it’s Findbugs, PMD, QJPro or another static code analyzer, this type of tool, when placed in a nightly build structure, allows for daily reports on quality and zooms in on particular issues. It enables you to detect defect numbers per package and helps you determine where to start your debugging and refactoring efforts.

What’s the main weakness of static analysis? It produces high volumes of detailed information. However, once developers accept this approach, especially if it’s vectored, they’ll reap significant improvements in quality.
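As a concrete instance of the string-concatenation-in-a-loop performance pattern (the SBSC warning family in Findbugs), the sketch below is illustrative only: concatenation recopies the accumulated string on every pass, while StringBuilder appends in place, and both return the same result.

```java
// Miniature of one performance pattern: same output, different cost curve.
public class ConcatDemo {
    static String slow(String[] parts) {
        String s = "";
        for (String p : parts) s += p;        // recopies s on every pass: O(n^2)
        return s;
    }

    static String fast(String[] parts) {
        StringBuilder sb = new StringBuilder();
        for (String p : parts) sb.append(p);  // appends in place
        return sb.toString();
    }

    public static void main(String[] args) {
        String[] parts = {"a", "b", "c"};
        System.out.println(slow(parts).equals(fast(parts))); // true
    }
}
```

Because the two versions are functionally identical, only a profiler under load or a static analyzer will ever point at the slow one, which is exactly the class of defect this article argues for automating away.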



Get Quality


By Nigel Cheshire

T

he goal of any software development organization is to deliver quality code to specification, on time and within

This 6-Stage Template Can Lead You To A Successful

budget. To meet this goal, development organizations must have the right processes in place. Many organizations have processes, but not many standardize them across teams, and some even allow developers to choose whether or not to actually follow them. Developers often write code to loose requirements or out-ofdate specifications, perform cursory checks and expect their test group or QA department to identify any bugs. This method of building code offers no way to track behaviors in order to facilitate improvement. Organizations need to have processes and procedures in place that allow them to measure performance in a meaningful way. So what can development organizations do to alleviate these problems? With trend analysis informing a quality-driven process, you can gain visibility into your development processes and track developer behaviors for continuous improvement throughout development.

Imposing a Process

QualityDriven Process

Trying to impose a process on an established development team may seem like a daunting task, especially for organizations with a large existing code base. Applying process is also a challenge for new development teams, who may not know where to begin. One of the biggest roadblocks to improving the quality and efficiency of the development process is fear. Development organizations are often reluctant to change because the fear of new systems or processes interrupting their projects for any prolonged period of time outweighs their dissatisfaction with existing processes, suboptimal as they may be.

A phased approach to quality-driven development allows organizations to proceed through a series of steps, each building on the last. The goal is to gain visibility, understanding and refinement, not only into the behaviors that affect different areas of quality, but also with an eye toward the remaining time on current projects and the structure of future projects. The idea is to learn from yesterday in order to plan for tomorrow. Here are the six necessary stages an organization must implement to create a successful quality-driven development process:
• Goals and standards: Identify specific goals and standards
• Tools: Facilitate standardization and data collection
• Automation/Repeatability: Eliminate the human factor
• Measurement: Collect key data from systems/tools
• Analytics: Consolidate data into one central repository
• Trend Analysis: Track trends over time and look to the future
Although the stages can be sequential, development organizations may find themselves at any one of these stages without having passed through the previous one. In many cases, they must take a step back to progress further.

First, development organizations must identify what goals they want to achieve. Do coding standards need to be enforced? Does 100 percent of the code base need to be tested? Must high-priority bugs be completely eliminated? Perhaps these goals are too aggressive. The absolute values of the metrics differ based on each organization's corporate standards. Once the process is in place, these goals can be achieved by implementing a source-code control system, a unit-testing framework and/or a code analysis tool at the start of any development project. Automating any task, such as a build process, will also help to minimize human error and gain repeatable results.

To facilitate trend analysis, the highest level of this phased approach, each stage must be built upon the last, as outlined above. It isn't necessary, however, to halt all development work to implement a quality process—the process itself will be built over time. For example, organizations with a large installed code base may want to ensure that all new work is unit-tested so that over time, their test coverage steadily increases. Similarly, with a source-code control system, from a particular install date, all code changes can be tracked. Watching these quality metrics improve over time as their positive influence spreads through the entire development team is just one of the benefits of trend analysis. Now, let's examine each level in a bit more depth.

Nigel Cheshire is CEO of Enerjy Software, a division of Teamstudio. He holds a bachelor's degree in computer science from the University of Teesside, England.

PROCESS IMPROVEMENT

FIG. 1: THE TRENDS OF THREE DEVELOPERS

Standards & Goals – Level 1
To begin implementing quality processes, standards and goals must be set. Every organization has standards and goals, but determining which are suitable and which are generally accepted as industry best practices is a different matter. Existing development teams can derive some standards from the process they already have in place. Setting high standards or goals at the outset ensures that subsequent steps in the process are similarly focused. When determining a set of standards and goals, team involvement is important because it will make the remaining stages easier to implement and ensure a smoother flow throughout the process. When key team members participate in defining standards, not only does it ensure their buy-in, it also helps generate good ideas. Some typical Java development standards include:
• General code compliance to a set of defined standards (for example, code documentation and unused elements)
• Unit testing coverage to an acceptable percentage
• Tracking bugs and issues back to the relevant files and authors
• Customer sign-off on working prototypes that conform to a subset of requirements
• Enforcing design patterns to decrease cyclomatic complexity and tight coupling

• Software Test & Performance

Tools – Level 2
Once a set of standards and goals is in place, the next stage is to implement the tools that will help development teams meet these standards. To facilitate teamwork and streamline processes, all or a subset of basic coding tools may already be in place. These may include IDEs, source-code control systems, code analysis tools, testing frameworks, coverage tools and bug-tracking tools, with the understanding that these tools make the developer more efficient and the structure and control of the project more manageable. Many organizations are currently at this level. These tools don't necessarily have to be standardized across development teams, because some developers may have experience with, and preferences for, certain products. It's essential, however, that they are used throughout the development process to drive productivity, increase the quality of output and ease the day-to-day tasks that put developers at risk of human error. The output from these tools and the metrics they generate are a critical step in bringing increased visibility and improved quality practices into the development process. For example, Figure 1 illustrates the trends of three individual developers and their adherence to standards compliance over a specified period.

Automation/Repeatability – Level 3
Manual processes, especially repetitive tasks, are highly error-prone. With the correct setup and configuration, process automation within the development life cycle will eliminate the potential for human error and speed product delivery through repeatability. Productivity will also rise when boring, repetitive tasks have been automated, leaving the development team free to focus on competitive enhancements. Process automation can range from a developer pushing a single button to entirely automatic scheduling. Using such tools will allow computers to perform time-consuming and processor-intensive tasks offline or after hours. Other tools facilitate automated reporting (such as alerting the system architect of a failed build via an SMS message or pager alert). Another example is an automated build process, which is essential for repeatable, clean builds. Java development teams may use an Ant or Maven script for one-button processing from compilation to deployment.

Measurement – Level 4
At this level, organizations begin to concern themselves with quality results. The previous levels provide a good foundation for quality practices, with the resulting productivity improvements, but quality can be realized only when output from the lower-level processes (or a combination of them) is measured. To introduce new development practices, measurements must be used to identify where resources should be concentrated. For example, in unit testing, the pass rate is a key measurement. But what is an acceptable pass rate? Is it enough that all 5,000 unit tests have passed? If 100 percent of those tests pass, this may indicate a sufficient level of quality. However, if the 5,000 unit tests cover only 20 percent of the code base, the application quality level is far less certain. Only through measurement and cross-referencing of results can organizations obtain a clear picture of quality. In this case, it might be useful to implement a standard in which developers can't check in their work until the testing percentage has met a predefined threshold.
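The cross-referencing described above can be sketched in a few lines. The combined metric below is an illustrative construction, not a standard formula:

```python
# Hedged sketch: combine unit-test pass rate with code coverage, since a
# perfect pass rate over 20 percent of the code base is a weak signal.
# The multiplicative weighting is an assumption made for illustration.
def quality_signal(tests_passed, tests_total, coverage_pct):
    if tests_total == 0:
        return 0.0
    pass_rate = tests_passed / tests_total
    return pass_rate * (coverage_pct / 100.0)

# All 5,000 tests pass, but they cover only 20 percent of the code base:
print(quality_signal(5000, 5000, 20))   # 0.2
```

A check-in gate of the kind the text mentions would simply compare this value (or raw coverage) against the predefined threshold.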

Analytics – Level 5
With the implementation of quality practices and the use of coding tools, organizations are now able to collect metrics on key data such as standards compliance, test coverage, customer satisfaction and design flaws. For development teams, analyzing data allows managers to identify behavioral trends or insufficient processes, but in reality, few organizations are truly at this level. To determine if an organization is set up to gather and analyze key data, ask questions such as:
• Who was responsible for the last bug reported by a customer?
• Was the code tested?
• Are a significant number of issues assigned to this person?
• Who on your team has no bugs outstanding and no standards violations, and meets the test-coverage thresholds that were set?
Pulling this data together provides teams with instant visibility into developer behaviors and trends, enabling development managers to focus on necessary training and resource allocation. See Figure 2 for a sample at-a-glance view of project trends in various quality areas such as unit test coverage and standards compliance.
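A minimal sketch of the kind of consolidation these questions imply; the record fields and developer names are hypothetical, standing in for data pulled from a bug tracker, a standards checker and a coverage tool:

```python
from collections import defaultdict

# Hypothetical consolidated records, one per source file; real data would be
# collected from the team's actual tools.
records = [
    {"author": "ana", "bugs_open": 0, "violations": 0, "coverage": 85},
    {"author": "bob", "bugs_open": 3, "violations": 7, "coverage": 40},
    {"author": "ana", "bugs_open": 1, "violations": 2, "coverage": 90},
]

totals = defaultdict(lambda: {"bugs_open": 0, "violations": 0})
for rec in records:
    dev = totals[rec["author"]]
    dev["bugs_open"] += rec["bugs_open"]
    dev["violations"] += rec["violations"]

# "Are a significant number of issues assigned to this person?"
worst = max(totals, key=lambda d: totals[d]["bugs_open"])
print(worst, totals[worst])
```

Once the data lives in one place, each of the questions above reduces to a query like this one.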

FIG. 2: AN AT-A-GLANCE VIEW OF TRENDS

Trend Analysis – Level 6
Organizations that use trend analysis by correlating data from different sources to adjust and streamline processes are ahead of most software development organizations in their industry and thus have a significant competitive advantage. At this stage, an organization's processes are optimizing the efficiency and quality of project output and deployed applications. By tracking trends over time, management can identify what is happening throughout the course of the development project, and analyze the results to identify and correct problem areas well before the project's last few weeks.

Visibility Into the Life Cycle
A quality-driven development process provides visibility into the development life cycle by consolidating data from key processes, tools and standards, such as unit testing results, code coverage percentages, best practices, compliance to coding standards, overall activity and bug detection. Most Java development tools create metrics from which limited decisions can be made, but these are typically mere snapshots of data at a single point in time. Development tools also don't consolidate data from each other; a code coverage tool may not show who is responsible for untested code, an important factor when analyzing developer behaviors and trends. What is needed is a system that consolidates data from key development tools and systems, allowing development organizations to track code-quality improvements over time.

Refining the quality process is definitely an ongoing task, especially with external factors such as Sarbanes-Oxley driving the change process. However, if an organization is at the trend analysis level in the process, these changes will be easier to implement because a good management system is already in place. This allows the development organization to take positive action with less financial impact.

By implementing a quality-driven development process and analyzing these key metrics, development managers can optimize the performance of their teams, minimizing time wasted on avoidable rework, tracking down bugs, or lengthy or ineffective code reviews. Development teams can quantify and improve application quality at the beginning of the development process, when it's easier and more cost-effective to address problems, and predict future trends based on their own experience.



13 Steps to Automation Script Success
Are Your Automation Scripts Too Fragile? Go Step By Step to Help Make Them Bulletproof

By José Ramón Martínez
Photograph by Joe Stone

The purpose of this article is to show how to make effective automation scripts. The article isn't tool dependent—many tools on the market can do this task, and the information offered here applies no matter which tool you're using. However, since I don't refer to any specific commercial tool, some of the described steps may not be necessary for some tools.

Here, I'll present 13 steps to convert the script that a tool—any tool—creates into a program that is reusable, structured, nondependent, easily maintainable and bulletproof. The result is the conversion of the initial script into a structured program, avoiding the most common problems. But first, let's review some underlying concepts.

José Ramón Martínez is a quality assurance manager in the I+D department of Nextel in Madrid.

AUTOMATION SCRIPTS

Automation Objectives
When I refer to visual automation, I mean automation through a graphical user interface. The objective of any visual automation project is to obtain a test script that reproduces the user's actions with the GUI. But for good automation, the objective is to obtain scripts that:
• Are able to run without any human intervention, typically after the nightly compilation.
• Are able to guarantee the regression of the products tested. Thus, faced with new characteristics of the product, environmental changes (operating system, database and so on) or after corrections, the scripts can guarantee that the functionality of the product is the same as it was before. Here, you assume that the former behavior was the "good one" and that it was verified.
• Are able to work in any known or unknown condition until they finish all the work, always executing all possible test cases.
• Show error localization clearly and unmistakably in their results.
• Allow each test in the set to be run independently.
• Permit most tests of the set to be combined in any possible order. Thus, the combined test can be arbitrarily long.
• Guarantee that at the end of the execution they show a clear and informative report of the results that is auto-registered in an execution repository.

When to Automate
In choosing automation, consider three conditions:
• You can automate when the cost is recouped in a few runs—that won't be done by hand—and in a short period of time. If the automation takes more time than manual execution and you're executing the manual test only a couple of times, why bother automating?
• You can automate when no important bug is found during the process. If an important area is not tested or not fully tested, or if bugs prevent the test from running, don't waste time automating.
• You can automate an interface when the visual presentation and the behavior are stable and correct. If not, you'll have to redo your test again and again.
If any of these conditions aren't met, automation is a waste of time—no matter what your manager says!


Inside the Tools: The Object Map
To make a good script, you must understand how the tools work—at least the most common ones. All tools offer some useful information (properties) about each element that the user "touches." This information is used for recognizing and differentiating the elements. With this information, the program generates a map of the objects (the name varies according to the program), with all the information that the program considers useful for recognizing and using the elements in the script. Therefore, you have two sets of elements to manipulate: the script code and the object map associated with it. You'll use both in the automation process.

First, the environment must be stable in behavior, with no changes foreseen in the elements' placement—although this last condition has become less important, since programs now use more than one property for identification of elements, employing "threshold" and "let's look for it around" techniques. It's still relevant, however, since an application with "moving" elements is probably changing more than the elements' positions.

Now you're ready to get automated, step by step.
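As a rough illustration of the two artifacts, an object map can be thought of as a table of recognition properties keyed by logical name. The property names below are invented for the sketch, since each tool stores its own set:

```python
# Illustrative object map: logical names mapped to recognition properties.
object_map = {
    "SaveButton": {"class": "Button", "caption": "Save", "id": "btnSave"},
    "NameField":  {"class": "Edit", "label": "Name"},
}

def find(logical_name):
    # Script code refers to elements by logical name, never by raw
    # properties, so a property change is fixed once, in the map.
    return object_map[logical_name]

print(find("SaveButton")["caption"])   # Save
```

This separation is what lets a later step edit recognition properties without touching the script code at all.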

1. Plan the Script
Before doing any automation, you must plan what to include in it. You need to:
• Decide which action or feature to automate, such as printing a document.
• Choose what screens and elements in each screen to use.
• Examine and reproduce the actions you want to automate and ensure that no bugs hamper the actions. The execution path must be clear.
• Write (in the place and format you choose) a short description that associates the name of the test with the things it does, so it will be easy to run and understand the test in the future.
It's challenging to translate a manual test into an automatic one. Further, it's impossible to create an automatic test exactly like a manual one—and even if it were possible, it's not usually desirable. The manual test has many details; it depends a lot on the tester's observations and doesn't need any initial stable state. All of these details, and more, are impossible to achieve in an automatic test without high cost. You can start from a manual test, but usually you'll have to simplify it considerably.

2. Save the Objects
In the recording process, you must first save the objects you'll use in the script. Keep in mind that the program lets you automate only the objects it knows. Thus, you must teach the program the main elements it has to learn. The procedure changes depending on the program, but the key point is the same: You can never assume that the program automatically knows all the things you can see onscreen (all the main windows or all of the tabs); it will learn only the things you touch (and sometimes the immediate parent or children). Thus, the best procedure is to touch with the mouse all the elements and sub-elements (for example, cells in a grid). In this way the program will learn them for sure. In other programs, the only procedure for learning the objects is to save a test script that includes them.

After saving the objects, you proceed to record the first script. You record all the actions you've planned and then stop recording. Voilà! You have a complete script that includes all the actions you want to automate. This script contains a unique function that does everything. This function can be very long, in a daunting block of code. But you're not done yet: If you run this initial script again, most of the time it won't work properly—you need to shape it.

3. Eliminate Variable Dependencies In the next step, you’ll edit the object map and change the properties of the objects the program uses for recognition that can spoil the script. The values you’re looking for are the ones that can change in subsequent executions of the test (and this execution can be in another computer). For example, if the name of a button changes (the same button is used for various purposes, for example), or you want the application in various languages, the logical course is to eliminate the name of the button as a property or, if the program allows it, to change it to a regular expression such as *. Also, in the code, you must eliminate all the coordinates’ dependencies. That way you’ll avoid the problem of not finding an element due to a change in its screen size or position inside the application. Sometimes it’s impossible to eliminate the coordinates’ dependencies because the only way to recognize some elements is by using them.
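How a relaxed recognition property behaves can be sketched with an ordinary regular expression. The captions below are invented examples of one button whose label changes by purpose or language:

```python
import re

# Match any caption the button can show, instead of one literal string.
# The pattern and captions are illustrative, not any specific tool's syntax.
caption = re.compile(r"Save.*|Guardar.*")

print(bool(caption.fullmatch("Save As...")))   # True
print(bool(caption.fullmatch("Guardar")))      # True
print(bool(caption.fullmatch("Cancel")))       # False
```

Tools that only support the `*` wildcard are doing the same thing with a simpler pattern language.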

4. Modularize the Script
The result of the first recording is a monolithic script without separations. Now you must modify the script's code to make it modular. First, each script must deal with only one application or feature to allow for ease of reuse. Sometimes you must move (or even erase) all the references to other applications and include (or record again) them in another script. Thus, each script will deal with only one application or feature.

The next logical step is to modify the scripts, dividing them into functions that implement the different actions that you've recorded: "SaveFile", "LoadFile" and so on. These functions will follow the criteria of maximum cohesion and minimum coupling: The name of the function will define a unique action that will be the only thing the function does, and the function won't depend on any other function. Also, the functions must be short. During this process, you abstract the functions and group the common ones. Thus, in a natural way, you obtain functions with arguments, for example, "GiveValueToUser(cell x, cell y, name)", that you can reuse with minimal effort.

Now, in every script, you write (not record) a main function that calls all the functions you've created. This main function won't include any recording logic; its only mission will be to check that all the functions you've made work properly.

Next, you create a new blank script, in which you design a directory function—main—that will drive the calls to the rest of the scripts and functions. This directory function will call only these functions, without any recording logic. With this main function, it's easy to change the order of the calls, to extend the test (for example, with a loop), to make new test combinations and so on. Notice that this directory function won't call the main functions of every script, but the functions they have implemented. This main function will drive one test, which will be considered a test unit.

Finally, in another blank script, you write (not record) a main function that calls all the test units you've created (in any order, repeating some of them). This function will contain the whole test you want to run, and this test main will call the main functions of every test unit, which contain no recording logic.
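The layering this step describes can be sketched as follows. The action names are illustrative; in a real script, each action function would wrap the tool's recorded operations:

```python
def save_file(name):                 # one action per function
    return f"saved {name}"

def load_file(name):
    return f"loaded {name}"

def test_unit_roundtrip():           # a test unit combining action functions
    return [save_file("a.txt"), load_file("a.txt")]

def main():                          # directory function: drives the test units
    results = []
    for unit in (test_unit_roundtrip,):
        results.extend(unit())
    return results

print(main())   # ['saved a.txt', 'loaded a.txt']
```

Reordering, repeating or looping the entries in `main` builds new test combinations without touching any recorded code.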

5. Write Comments
To make the functions you've created understandable, you must add the necessary comments, in at least two places:
• In the header of the critical functions, explaining what the function does, its parameters and its results (error exits included). This is very important in functions that are considered "library" functions. If you have classes, you must include header comments in them too.
• In situations that require difficult decisions or decisions that aren't immediately understandable.
Remember that code is read many times, but written only once. As a norm, document the script's code the same way you'd like to read it in somebody else's script—don't forget the Golden Rule.

6. Make Test Units Short With test units, you have to keep it brief. That way, you can limit the damage of any error in the middle of the entire test and can combine different test units like puzzle pieces. You can also change a whole unit without affecting the rest of the test. Debugging the script’s code is also easier with shorter tests, and the tests themselves are more robust because they have fewer dependencies. And, as noted before, short tests can generally call many different scripts’ functions.

7. Extend the Script's Code
Now that the program has learned the objects, you can make the test grow from within the script without further recording. That way, you enlarge the test without having to record anything else. For example, it's usual to introduce a loop, changing some values in every step. All this implies that you need to know the language of the script and have some basic programming skills. At another level, by combining the test units in different orders, you'll generate new tests easily. In this process, if it becomes necessary to learn new objects, the programs provide mechanisms to learn them. In some programs this will oblige you to record more actions; in others, it will be an object map addition. Of course, if you need to add more functions or scripts, you can record them and use them in the test.

Also, when extending the code, don't feel constrained by the things that the program can record. Automation tools give you a computer language that usually allows you to program nonvisual actions directly "under" the interface. For example, you can usually make direct database access to check a result.
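Extending a recorded action with a loop might look like the sketch below; `fill_form` is a hypothetical stand-in for a function built from recorded actions:

```python
def fill_form(name, amount):
    # Stand-in for a recorded, parameterized tool action.
    return f"{name}={amount}"

# One recorded action becomes many data-driven steps, with no new recording:
results = [fill_form("order", amount) for amount in (10, 20, 30)]
print(results)   # ['order=10', 'order=20', 'order=30']
```

The same loop could just as easily read its values from a file or database, which is where data-driven testing begins.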

8. Ensure the Environment From the Start
The scripts you're creating must always execute under the same conditions as they were recorded. Only in this way can you guarantee that the results are comparable to those from the recording. So you must create functions that guarantee that the environment is correct. These functions will check the environment at the start of the test unit and force it to be adequate if necessary. Typical functions:
• Determine if the process is already executing and eliminate or create it (depending on the case). Note that most automation tools have problems if there is more than one instance of the program running.
• Check/delete/insert necessary data in the database.
• See if memory is adequate.
• Check the screen size, number of colors, presence of a mouse and other factors.
Some of these functions, with some generalization, will end up as library functions that you'll reuse later. Another good practice is to reset the environment at the end of the test unit. Usually this implies cleaning the database of the data introduced during the test unit. In this case, the objective is not to ensure a "clean" execution—you ensure this at the start of the test—but to avoid the accumulation of garbage that can spoil other test units.
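An environment-guard function along these lines might look like the sketch below. The two checks shown (disk space and a leftover data file) are illustrative stand-ins for the tool- and application-specific checks listed above:

```python
import os
import shutil

def ensure_environment(min_free_bytes=100 * 1024 * 1024,
                       leftover="stale.tmp"):
    """Check preconditions at the start of a test unit; return problems found.

    The threshold and file name are assumptions for the sketch."""
    problems = []
    if shutil.disk_usage(os.getcwd()).free < min_free_bytes:
        problems.append("low disk space")
    if os.path.exists(leftover):      # garbage left by an earlier unit
        os.remove(leftover)           # force the environment to be adequate
        problems.append("removed leftover file")
    return problems

print(ensure_environment(min_free_bytes=0))
```

A matching teardown function, run at the end of the unit, would delete the data the unit itself created.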

9. Add Trace Data And Verification Points
Next, you'll include trace information in the scripts. Trace information will inform you where the execution is and what the test is doing at any time, including error messages with essential information about the problem. With trace information, you can follow the evolution of the execution. The trace's text must be clear and comprehensive. From this trace information, the execution is explained in an intelligible way. You can even save trace information and consider it a baseline for comparing different executions. Of course, the saved trace must come from a fully correct execution.

Some programs can automatically check the values of certain visual elements during the execution; these checks are called verification points. By checking these verification points, you can decide whether the action the script reproduces has run well or not. The verification points have to be carefully chosen for their relevance, and must show unequivocally the current state of operations. The values chosen must be unique in the current state; any deviation is an error. Traditionally, the values chosen are text literals, but this isn't mandatory; they can be in the form of images, window presence/absence and so on. You can also implement these verification points without reading the interface; for example, instead of reading the text box with the results, you can get this data from the database (thus working "under" the interface).

Without traces or verification points, the only thing you can know about an execution is whether it's globally right or wrong, but this doesn't mean that every action in the test is necessarily right. With traces and verification points, you can ensure that the entire execution is right or pinpoint where it goes wrong. Without them, a test is not valid.
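A trace-plus-verification-point helper can be sketched with the standard `logging` module; the `verify` contract below is illustrative, not any specific tool's API:

```python
import logging

logging.basicConfig(level=logging.INFO, format="%(levelname)s %(message)s")
log = logging.getLogger("script")

failures = []

def verify(point_name, expected, actual):
    """Check one verification point; trace the outcome either way."""
    if expected == actual:
        log.info("PASS %s: %r", point_name, actual)
        return True
    log.error("FAIL %s: expected %r, got %r", point_name, expected, actual)
    failures.append(point_name)          # record it, but keep executing
    return False

verify("totals match", 100, 100)
verify("status label", "Done", "Error")
print(failures)   # ['status label']
```

Because every check is traced, a saved log from a known-good run can serve as the comparison baseline the text describes.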

10. Protect Code And Restore State
Now you have the code divided in a natural way into functions. You need to protect it to prevent the script execution from ending prematurely. You must introduce try-catch-finally structures to deal with possible exceptions. In addition, you must create recovery functions that restore the system to its state at the time the problems arose. These functions can even restart networks or machines, or execute database functions (restoring the database from a backup, for example), and are so useful that they usually end up as library functions. No matter how complex they are, they don't have many maintenance problems because they work under the interface.
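The try-catch-finally protection this step describes can be sketched in Python syntax as follows; `recover_environment` is a hypothetical stand-in for a library recovery function:

```python
def recover_environment():
    # Hypothetical recovery: a real script might close dialogs, restart the
    # application or restore the database from a backup here.
    print("recovering to a known state")

def run_protected(test_unit):
    """Run one test unit; any failure is contained and the state restored."""
    try:
        test_unit()
        return True
    except Exception as exc:          # any error must not end the whole run
        print(f"test unit failed: {exc}")
        return False
    finally:
        recover_environment()         # always runs, pass or fail

def broken_unit():
    raise RuntimeError("element not found")

print(run_protected(broken_unit))   # False, but the run can continue
```

The `finally` clause is what guarantees the recovery function runs on every path, which is the whole point of the step.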

11. Reuse the Code And Generate Libraries
When an application is automated, many of the tests share common elements, and it's logical to try to reuse functions and their objects in different scripts. These reused functions must be very robust and strongly parameterized. They must also have comments, as well as a clear and explanatory trace. If the automation functions are well designed, they can be easily reused. They can be extended without having to be rerecorded; you'll only have to combine the functions created. In addition, the code will be readable and maintainable like any quality software program.

Finally, one of the objectives of efficient design is the generation of a group of library functions that can be reused in different automations. In this case, these functions won't be reused between various scripts of the same project, but in entirely different automations. This implies that these functions must:
• Be very defensive in their code. They must check everything, even the most obvious details.
• Be strongly documented.
• Have maximum cohesion and no coupling at all.
• Have a clear, comprehensive trace that needs no further addition.
• Be very robust against any possible error.
• Be parameterized in a flexible way.
• Include informative comments about the functions, their use, complex decisions and so on. Header comments are mandatory.
• Be stored apart from the rest of the scripts in a different file under configuration control.
Take great care in maintaining these functions—you won't know how many tests use them. Thus, any change must guarantee identical behavior in every possible scenario, so you won't affect the existing tests.
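The requirements above can be illustrated with one small function; the grid limits, argument checks and trace format are all assumptions made for the sketch:

```python
def set_cell(row, col, value, max_rows=100, max_cols=26):
    """Defensive library function: check everything, even the obvious."""
    if not isinstance(row, int) or not isinstance(col, int):
        raise TypeError("row and col must be integers")
    if not (0 <= row < max_rows and 0 <= col < max_cols):
        raise ValueError(f"cell ({row}, {col}) out of range")
    print(f"TRACE set_cell({row}, {col}) = {value!r}")   # clear, complete trace
    return (row, col, value)

print(set_cell(2, 3, "total"))
```

A function this paranoid costs a few extra lines once, but pays for itself in every automation that borrows it.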

12. Build a Bulletproof Script
Last, your script has to be bulletproofed against any error, whether due to the test or to external elements. You must always guarantee that on any known or unknown error, the system will continue to the next test unit. Also make sure that this next test and the following ones won't change their results because of the previous error; that is, you must ensure recovery to a proper state after each possible failure. Ideally, after every function you would check the resulting values and, if they aren't useful for the next element, establish the proper ones, so that the execution never halts prematurely. In real tests, this is difficult to achieve: When a script fails in the middle of an execution, the rest of that script is spoiled to the end of the test unit, so checking and restoration are done only at the beginning of every test unit. If any test unit fails in any way, you must ensure that the rest execute properly.
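A bulletproof driver along these lines can be sketched as follows; the unit names and the simulated failure are illustrative:

```python
def unit_login():
    return "ok"

def unit_report():
    raise ValueError("unexpected dialog")   # simulated mid-test failure

def run_all(units):
    """Every unit runs, no matter what the previous one did."""
    results = {}
    for unit in units:
        try:
            unit()
            results[unit.__name__] = "passed"
        except Exception:
            results[unit.__name__] = "failed"   # record and move on
    return results

print(run_all([unit_login, unit_report]))
# {'unit_login': 'passed', 'unit_report': 'failed'}
```

In a full script, each iteration would also call the environment-guard and recovery functions from the earlier steps before starting the next unit.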

13. Document the Test
Finally, write a test description, in the proper format and using the proper repository. Document it as thoroughly as if it weren't your test. State what the test does and the actions you're testing. Remember that this description should be useful for any person using the test in the future. Don't focus on how you test, but on what you test. Also, the description must be properly written and edited—the value of a clear description is worth the trouble.

The result of all your efforts will be a viable program—not just a script, but well-built software. You'll store the program in a code repository just as you would the rest of your software. It will be subject to version control, development branches and so on, just like any other software element.

REFERENCES
• James Bach, "Test Automation Snake Oil" (white paper, 1999)
• Mark Fewster, "Jumping Into Automation Adventure with Your Eyes Open," Journal of Software Testing Professionals, March 2002
• Bret Pettichord, "Seven Steps to Test Automation Success" (2001; revised edition of a paper originally presented at STAR West, San Jose, November 1999)



Best Prac t ices

No More Webcasts, No More e-Books...

In early 1998, I began a six-year career at Intel. During my first day on the Folsom, Calif., campus, my hiring manager mentioned I’d be spending “some time” in training before being shown my customer service cubicle.

Fine, I thought. How long could it possibly take to learn to book and ship orders of Pentiums?

For the record, the answer was nearly one month.

At the time, Intel used the SAP R/3 system to manage its global web of factories and assembly-test facilities. Touching the system conferred no small amount of power and responsibility, since with just a few keystrokes nearly anyone could disrupt the movement of millions of dollars of microchips around the globe. So even customer service lackeys like me had to be fairly well-conditioned before being given the digital keys to the kingdom—namely, our username and password to the production environment.

Armed with little more than a bachelor’s degree in economics and a desire to work with big numbers, I remember initially being impressed that I’d be helping to direct traffic in such a lucrative global supply chain. (“I’ll really be booking seven-digit purchase orders?” I remember thinking.)

Do you need someone to load more memory chips onto the line in Malaysia? Change the price of a batch of then-top-of-line Pentium II processors? Check on the availability of chipsets in Arizona? I was your man, at least in the sandbox training environment where we trainees practiced.

Jokes and Candy
However, somewhere into the third week, as Dave, our trainer, delved into ever more obscure features of the application, I found myself thinking more about the mocha I’d be having at lunch than the module I was supposed to be learning. Near the end of the month-long session, I realized that my comments on the regular feedback forms were influenced as much by the generally high quality of Dave’s dry humor and pass-it-around candy as by his lesson plans and case-study examples.

I thought back to this Intel experience as I reported this column. Training remains a fact of life for people working in technology, be it in customer service or more hard-core coding and testing roles. In a world of service-oriented-everything, businesses’ networked nervous systems are more powerful than ever. So training is required not only to help developers stay safe when test-driving new technologies, but also to see new connections within the expanding grid of Web-based byways and highways. Oh, and the act of learning still must be lively enough to prevent students from snoozing.

Do As I Do
One trick to training success, no matter the topic, is how the subject matter is conveyed. Academic experts seem to agree that, hands down, modeling behavior is the best way to teach nearly any technical skill. Lectures, computer-aided instruction and good old-fashioned self-study from a manual or book simply can’t hold a candle to watching someone perform a particular bit of tech workmanship and then attempting to reenact it.

Skill acquisition is improved when so-called retention-enhancement activities are included in a training module. In a 2003 article in the journal Information Systems Research, University of Arkansas

business professor Fred Davis described how these activities work in practice.

All of Davis’s study subjects, undergraduate college students learning to use a spreadsheet application, were given a chance to watch and mimic an expert perform specific tasks. Additionally, some of the students were instructed to jot down summaries at several points during the class. Before the final quiz, this subgroup was given time to refer to their notes while visualizing successfully completing the tasks. Such retention-enhancement activities significantly improved task performance.

But c’mon, isn’t learning spreadsheets different than tackling niche coding problems? And doesn’t the rise of Web-based training basically blow the model out of the water?

“I would expect the findings to generalize to specialists using specific software development tools and skills,” Davis says. Translation: It would work for AJAX training, too. As for the Web, Davis sees it as potentially complementary. “One interesting wrinkle is that Web-based instruction could employ behavior modeling and retention-enhancement training.”

Davis’s work is a reminder that pedagogy matters as much as programming prowess. Still, the issue of just what to make of Web-based training is impossible to avoid. (I know Google returns scads of everything, but a search for “Web-based training,” with the quotation marks, yields roughly 3,270,000 results.)

According to the most recent American Society for Training and Development annual State of the Industry report, 28 percent of training was delivered via learning technologies in 2005, up from 24 percent the previous year.

“E-learning increases a few percentage points each year,” says William Rice, a New York City–based trainer and author of the book “Writing Successful Software Classes: A Plan for Course Developers and Training Managers” (Lulu, 2004).

Today, tools such as Qarbon’s ViewletBuilder and Adobe’s Captivate, both of which contain features that weren’t readily available back when I was suffering occasional bouts of Intel’s Folsom training blues, make it relatively easy to rapidly develop e-learning courses. “There is a trend toward shorter, Web-based, just-in-time training sessions,” says Rice.

Beyond authoring tools, says Marcus MacNeill of Austin, Tex.–based Surgient, technologies such as Web conferencing and virtual labs will eventually make it possible to decouple course material from delivery methods once and for all. “This is precisely the problem with software training: Hands-on labs are inherently tied to classroom training because that’s where the infrastructure is deployed,” says MacNeill, director of product strategy for Surgient, which specializes in virtual labs.

Up Close and Personal
However, I’m wary of any technology that completely obviates the need for face-to-face interaction. Tar me a hopeless Luddite, but I consider in-person instruction to be a nuanced human transaction with some well-evolved advantages compared to, say, some snazzy future version of Webcasts.

Nearly a decade later, I still remember my Intel trainer, Dave, as much for his tragicomic SAP-implementation war stories as for his diligence in working through the course material. I also recall his persistence in repeatedly working through a particularly knotty factory-loading problem that had the class stumped. And then there was his offer during our final class lunch at Chili’s—many of us would subsequently take him up on it—to contact him with any questions we encountered upon graduating from sandbox to multibillion-dollar supply chain.

“There’s really no substitute for interacting with people who have a fire in their belly and are thought leaders in their domain,” says Richard Arneson, a programmer with aQuantive in Seattle who’s recently worked with several trainers as part of his company’s embrace of agile practices. Arneson says experiences with such trainers are “more than just training; they’re important waypoints in the development of my thought about how engineering is done.”

My weeks with Dave were a waypoint in my understanding of global technology business. Today, in a world of tech commoditization and countless demands on one’s schedule, the best advice may be to spend some time evaluating options before making the training leap. After all, we all know how long it takes to forget an utterly milquetoast training experience. ý

Geoff Koch covers science and software from Lansing, Mich. Seen any training Webcasts worth more than a passing glance lately? Let him know at koch.geoff@gmail.com.

Index to Advertisers

Advertiser                                     URL                                         Page
Agitar                                         www.agitar.com/learnmore                    40
Axosoft                                        www.axosoft.com                             2
Bredex                                         www.bredexsw.com                            33
Enerjy                                         www.enerjy.com/visible                      33
Critical Logic                                 www.critical-logic.com/products/TMXwrx      39
Instantiations                                 www.instantiations.com/rcpdeveloper         6
ITKO                                           www.itko.com                                22
Parasoft                                       www.parasoft.com/Jtest                      3
Seapine Software Inc.                          www.seapine.com/st&p                        4
Software Security Summit 2007                  www.s-3con.com                              28
Software Test & Performance                    www.stpmag.com                              34
Software Test & Performance Conference 2006    www.stpcon.com                              20-21
Test & QA Report                               www.stpmag.com/tqa                          36


Don’t Miss Out On Another Issue of The Test & QA Report e-newsletter!

Each FREE weekly issue includes original articles that interview top thought leaders in software testing and quality trends, best practices and Test/QA methodologies. Get must-read articles that appear only in this e-newsletter!

Sign up at: www.stpmag.com/tqa


Future Test

Addressing Application Manageability
Mike Curreri

IT and development organizations traditionally operate independently with minimal collaboration. Coupled with the lack of shared tools, this situation drives up the cost of problem management and detracts from the end-user experience. To reduce the cost of problem resolution and improve our application performance, these walls must come down.

Application production failures are everyone’s problem. Industry statistics indicate that problem management is by far the biggest expense we face across the application life cycle.

However, our industry is recognizing the need for new methodologies, tools and processes to address application manageability. Microsoft, for example, is promoting the Dynamic Systems Initiative, while IBM is embracing Autonomic Computing. CIOs worldwide have reported major increases in budgets for application performance–monitoring initiatives.

Each of these initiatives relies on the manageability of applications early in the life cycle, and the proactive monitoring of those applications’ health in production. This is a departure from our tendency to collect requirements, design, develop, test and throw the application over the wall into production. Instead, the trend is toward designing, standardizing and communicating around a total application management approach.

Traditionally, detection and diagnostic information have been a development exercise that centers on capturing information to a log file or publishing to the system event log. It’s left to the development team to decide what information to collect. In the past, we relied on our end users or QA department for detection and our log files for diagnostic information. Lowering our management costs requires us to move toward a more proactive approach. To do this, we must first understand how the information will be interpreted by its consumers and how to standardize the presentation across applications.

Wrestling With Requirements
Addressing application manageability starts early in the life cycle, with requirements. Business analysts and stakeholders must envision how an application will effectively be managed once in production. This includes defining: (1) business requirements that outline the boundaries of acceptable application behavior; (2) how to detect deviations; and (3) what corrective actions are required in order to address the problem and maintain business objectives.

Application architects and development teams can leverage manageability requirements as the basis for designing a health model for an application. A health model should describe the state transitions within an application and define the detection mechanisms necessary to indicate that a transition is occurring successfully. Depending on the type of transition, diagnostic information may also be collected to assist in diagnosing and correcting any deviation.

Health models must capture information that is meaningful to the consumer and provides actionable results. In most organizations, the IT department is responsible for managing production applications.

A Health Model in Action
For a sample of the process, imagine that we have an application that requires access to a file on the file server. What happens if the IT department changes a security policy that causes an access-denied error?

First, our state transition rules may define this problem as critical and cause it to manifest itself as a critical indicator to the IT department. Many IT management systems represent health state as red (critical), yellow (warning) or green (healthy). In this case, our health model rules should roll up application health into this type of classification and properly notify the team. Supporting information should also be collected to indicate the type of problem, specific information about the instance of the problem, and steps to resolve the error. For our example, we may include the specific file being accessed, the security error and the exact permissions required to restore normal application behavior.

Up to this point, we’ve discussed processes for documenting and designing models for known application problems. But what do we do for unanticipated situations? We can choose from many solutions and strategies to mitigate risk for unidentified application problems. Infrastructure monitoring and management solutions provide a framework for monitoring services, networks and applications. Application management solutions provide application-centric monitoring and root-cause diagnostic information. Combining these tools with custom application health information provides a comprehensive view into known and unanticipated problems.

Incorporating application manageability into every phase of the life cycle can dramatically reduce costs. New tools and methodologies will improve our ability to manage complex systems and applications. Software companies are designing in better manageability and auto-healing capabilities to lower costs and increase availability.
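The red/yellow/green rollup described above can be sketched in a few lines. This is an illustrative model only—the class and method names are invented for this column, not any vendor’s API:

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

/**
 * Illustrative health-state rollup: the application is only as
 * healthy as its least healthy component.
 */
public class HealthModel {

    /** Declared least-to-most severe, so the worst state is the maximum. */
    public enum HealthState { GREEN, YELLOW, RED }

    /** Roll component states up into one application-level state. */
    public static HealthState rollUp(List<HealthState> components) {
        if (components.isEmpty()) return HealthState.GREEN;
        return Collections.max(components); // enum order: GREEN < YELLOW < RED
    }

    public static void main(String[] args) {
        // An access-denied error on the file server marks that component RED,
        // so the whole application reports as critical to the IT department.
        List<HealthState> states = Arrays.asList(
            HealthState.GREEN, HealthState.RED, HealthState.GREEN);
        System.out.println("Application health: " + rollUp(states)); // prints RED
    }
}
```

A real management system would attach the supporting detail—the file being accessed, the security error, the permissions required—to the RED state rather than reporting the color alone.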
Applied to custom application development and deployment, these concepts can boost the business bottom line. ý

Mike Curreri is CEO of AVIcode, an application monitoring and diagnostic solutions provider based in Baltimore, Md.



