Putting Data into SOA

Page 1

Business Intelligence Vol. 7, No. 11

Putting Data into SOA: Data Virtualization, Data Buses, and Enterprise Data Management by Ken Orr, Fellow, Cutter Business Technology Council

One element in all of the discussions of service-oriented architecture (SOA) implementation has been neglected — data. This Executive

Report highlights the major dimensions of what I consider SOA architecture, focusing heavily on the critical importance of data architecture in SOA governance, planning, and implementation.


Cutter Business Technology Council Rob Austin

Ron Blitstein

Christine Davis

Access to the

Experts

Tom DeMarco

Lynne Ellyn

Jim Highsmith

Tim Lister

Lou Mazzucchelli

Ken Orr

Ed Yourdon

About Cutter Consortium Cutter Consortium is a unique IT advisory firm, comprising a group of more than 150 internationally recognized experts who have come together to offer content, consulting, and training to our clients. These experts are committed to delivering top-level, critical, and objective advice. They have done, and are doing, groundbreaking work in organizations worldwide, helping companies deal with issues in the core areas of software development and agile project management, enterprise architecture, business technology trends and strategies, enterprise risk management, business intelligence, metrics, and sourcing. Cutter delivers what no other IT research firm can: We give you Access to the Experts. You get practitioners’ points of view, derived from hands-on experience with the same critical issues you are facing, not the perspective of a desk-bound analyst who can only make predictions and observations on what’s happening in the marketplace. With Cutter Consortium, you get the best practices and lessons learned from the world’s leading experts, experts who are implementing these techniques at companies like yours right now. Cutter’s clients are able to tap into its expertise in a variety of formats including print and online advisory services and journals, mentoring, workshops, training, and consulting. And by customizing our information products and training/consulting services, you get the solutions you need, while staying within your budget. Cutter Consortium’s philosophy is that there is no single right solution for all enterprises, or all departments within one enterprise, or even all projects within a department. Cutter believes that the complexity of the business technology issues confronting corporations today demands multiple detailed perspectives from which a company can view its opportunities and risks in order to make the right strategic and tactical decisions. The simplistic pronouncements other analyst firms make do not take into account the unique situation of each organization. This is another reason to present the several sides to each issue: to enable clients to determine the course of action that best fits their unique situation. For more information, contact Cutter Consortium at +1 781 648 8700 or sales@cutter.com.


Putting Data into SOA:

Data Virtualization, Data Buses, and Enterprise Data Management BUSINESS INTELLIGENCE ADVISORY SERVICE Executive Report, Vol. 7, No. 11

by Ken Orr, Fellow, Cutter Business Technology Council Service-oriented architecture (SOA) — a paradigm for “design, development, deployment and management of a loosely coupled business application infrastructure.”1 Software module — “a small self-contained program that carries out a clearly defined task and is intended to operate within a larger program suite.”2 A CRITICAL POINT FOR SOA Service-oriented architecture (SOA) is at that awkward, adolescent stage in the technology lifecycle that all important technologies go through. When SOA was only a concept, thought leaders could debate about what it meant, but no one really

questioned how great it was going to be. Software was going to be broken down into a set of clearly defined, loosely coupled services that would be commonly available across the Internet via Webbased directories with services defined in common, understandable terms so that building Webbased applications would be as simple as plugging a lamp or coffeemaker into a standard electrical plug or fitting a mouse into a USB port (notice I resisted the temptation to refer to LEGOs). But now, as people have begun to develop larger and larger applications using SOA, some of the original quibbles about definitions are assuming greater importance, and some ideas that were initially so promising seem to be in question.

As we know from the work of researchers like Everett Rogers3 and Clayton Christensen4 and popularizers like Geoffrey Moore,5 most technology innovation tends to play out in predictable states. In a 2002 Cutter Executive Report,6 I introduced my own technology maturity curve that I developed in the 1980s: the guru gap curve (see Figure 1). Like most major technological achievements, to paraphrase computer scientist Herb Grosch, much of what is good in SOA is not new, and what’s new in SOA is not necessarily good. Component parts are not new, and modular design is not new. What is new is the ability to develop and deploy well-structured,


2

BUSINESS INTELLIGENCE ADVISORY SERVICE Management Interest

Guru Predictions

Guru Gap Actual Productivity

Productivity

Prototyping Phase

Engineering Phase

2-4

Deployment Phase 10

20

Time (years)

Figure 1 — Orr technology adoption curve (the guru gap).

Web-based applications that draw on enterprise data crying to get out. What is happening today is that we in fact are rediscovering things that were perhaps better understood in the 1970s and 1980s than they are today. The aim of this Executive Report is to understand the vital importance that data plays in an SOA strategy. To do that, we have to revisit the basic assumptions of SOA and give you some understanding of why those conditions exist — in other words, rethink.

they are to be successful. It is in this period that individuals and organizations start applying the technology to really large, complex problems and begin to more fully document their experiences. I believe the engineering phase closely describes the current situation with SOA: after years of hype and hyperbole, ever larger SOA applications are now being developed and deployed in more and more organizations to perform more and more important functions.

Today, SOA is at what I refer to in my technology adoption model as the “engineering phase.” Like Moore’s chasm, my engineering phase is a critical juncture that technologies have to get past if

As in most things, there is good new and bad news. The good news is that individuals and organizations all over the world are learning a great deal more about how to “do SOA” as well as gaining a

greater understanding of SOA’s strengths and weaknesses. The bad news is that SOA, like any serious technology, has not been as easy to implement as it was originally made out to be and it has a number of design challenges. Finally, because SOA turns out to have many moving pieces, people-oriented elements like collaboration, coordination, and governance are seen as increasingly important. For example, one of the areas of contention in the new world of SOA is who has responsibility for the data quality and integrity of SOA applications: the developers or the database administrators (DBAs)? Often, there is a conflict between those who look at all data as XML and those that think of all data as relational tables. Some call this an “impedance mismatch,” but, as we see, it is really more of a semantic confusion. On the whole, I believe that SOA is a very good thing. I applaud the fact that SOA has brought modular design back into vogue, which is especially important to the quality of design of any large-scale application. I like the fact that SOA promotes standard functions and connectivity. But while it is a convergence of a number of technology trends,

The Business Intelligence Advisory Service Executive Report is published by Cutter Consortium, 37 Broadway, Suite 1, Arlington, MA 02474-5552, USA. Client Services: Tel: +1 781 641 9876 or, within North America, +1 800 492 1650; Fax: +1 781 648 1950 or, within North America, +1 800 888 1816; E-mail: service@cutter.com; Web site: www.cutter.com. Group Publisher: Chris Generali, E-mail: cgenerali@cutter.com. Managing Editor: Cindy Swain, E-mail: cswain@cutter.com. ISSN: 1540-7403. ©2007 by Cutter Consortium. All rights reserved. Unauthorized reproduction in any form, including photocopying, faxing, and image scanning, is against the law. Reprints make an excellent training tool. For information about reprints and/or back issues, call +1 781 648 8700 or e-mail service@cutter.com.

VOL. 7, NO. 11

www.cutter.com


3

EXECUTIVE REPORT SOA is at base an architecture, not a product or even a set of products. This has major implications for those implementing SOA. Like data warehousing, which was also an architectural shift when it first burst onto the scene, SOA will likely take the better part of another decade to succeed — if it is going to succeed. SOA continues a process that has been going on for at least the last 25 to 30 years in software development: the process that involves pulling apart the “independent” (orthogonal) dimensions of software. This process has allowed designers to be able to concentrate on one aspect of a complex system at a time, while holding the other dimensions constant: a process borrowed from other mathematical, scientific, and engineering activities throughout history. The purpose of this Executive Report, then, is to highlight the major dimensions of what I consider SOA architecture, focusing heavily on the critical importance of data architecture in SOA governance, planning, and implementation. We begin the report with a discussion of the separating of independent dimensions of software and why it matters to SOA developers. We then talk about the fundamental nature of services and the different kinds of services. The report then looks at software buses before addressing the most important dimension of SOA — that of enterprise data management and how it complements

©2007 CUTTER CONSORTIUM

Traditional Application Architecture

Security Component

Workflow Component

Presentation Component

Base Application

Reporting/BI Component

Database Component

Transaction Component

Business Rule Component

Component-Based Application Architecture

Figure 2 — Breaking up the “big ball of mud.”

SOA. The report then tackles the issue of reusability before concluding with a list of issues you need to be aware of regarding the road to SOA. TEASING APART THE “BIG BALL OF MUD” As introduced by Brian Foote and Joseph Yoder: A BIG BALL OF MUD is haphazardly structured, sprawling, sloppy, duct-tape and bailing wire, spaghetti code jungle. We’ve all seen them. These systems show unmistakable signs of unregulated growth, and repeated, expedient repair. Information is shared promiscuously among distant elements of the system, often to the point where nearly all the important information becomes global or duplicated. The overall structure of the system may never have been well defined. If it was, it may have eroded beyond recognition.7

In the beginning, individual programs and systems resembled a big ball of mud (see Figure 2). All

the functionality was contained in a single framework: dimensions like reporting, database, transaction processing, presentation, workflow, security, and business rules. Over time, these major dimensions have been teased apart so that they can be developed (and modified) independently. The most important of these independent dimensions (in roughly the historical sequence that they were identified and isolated) include: The reporting/BI dimension. In the earliest days of computing, developers recognized that all report programs followed pretty much the same hierarchical pattern. Report generators were the first of all the independent dimensions to be widely identified. The first report generators appeared in the early 1960s and 1970s and led to fourth-generation languages (4GLs) and a whole host of BI tools.

VOL. 7, NO. 11


4 The database dimension. The recognition of the database as an independent dimension occurred almost as early as that of reporting. The first stage of database management occurred with the introduction of physical and then logical file systems that made it possible for developers to access physical data without knowing the physical location or internal format. The first real database management systems occurred in the early 1960s. Network, hierarchical, and inverted file databases were popular in the 1960s, 1970s, and early 1980s but were overtaken by relational databases that first began to emerge in the late 1970s and early 1980s. In recent years, there has been increasing efforts to support object databases and XML structured data. The transactional dimension. The need to separate out the transaction component in large online applications was also recognized in the early 1960s, with the first and most popular transaction monitoring system, IBM’s CICS, being introduced in the late 1960s. With each new generation of IT platforms (mainframes, client-server, Web-based, etc.), there has been a recognition of the need for some form of transaction processing capability. The security dimension. Security became an issue with the introduction of online systems in the mid-1960s. The VOL. 7, NO. 11

BUSINESS INTELLIGENCE ADVISORY SERVICE introduction of LANs/WANs and the Internet and the evolution of viruses, worms, spam, and other forms of electronic attack have greatly magnified the importance of security management. The presentation dimension. The user interface began with the introduction of online systems and remained relatively constant until the introduction in the late 1970s and early 1980s of Windows-based user interfaces. Xerox introduced one of the first true GUIs in the early 1980s with its Star system, but it was Apple with the introduction of the Lisa and the Mac that greatly popularized the capabilities of sophisticated user interfaces. The introduction of the Web, the browser, HTML, and multimedia has produced the GUIs that we know today. In addition, the constant introduction of new wireless platforms (PDAs, cell phones, etc.) has led to the need for sophisticated GUIs targeted at a wide variety of devices. The workflow dimension. Workflow was initially introduced in the late 1970s and early 1980s largely to support image/document management. As organizations moved increasingly to electronic forms of communication, there was a need for some way to manage the routing of those electronic documents. By the mid-1990s, with the introduction of LANs and the Internet, worldwide

electronic routing became a reality, and organizations looking to implement business process management began to focus on workflow for all sorts of processes. The business rule dimension. Business rules have always represented one of the knottiest problems in application development. The early business rule engines were introduced in the 1980s based on artificial intelligence technology. In recent years, there has been a rediscovery of the importance of business rules, and exciting research has been working to make business rules as independent as, say, database management. This breaking down of these individual dimensions of software has taken decades and is still going on. This process has made it possible for developers to focus their concerns without having to worry about everything at once. Unfortunately, since many of the dimensions have been identified at different times, a great many legacy and other solutions each have their own variations of the other dimensions. For example, some database management systems have their own business rule engines, while various transaction systems have their own process (workflow) engines. One of the problems, then, is that tools developed to solve one dimension often expand to encompass one or more of the other dimensions

www.cutter.com


5

EXECUTIVE REPORT resulting in confusion, conflict, and added complexity. The purpose of introducing this classification of software dimensions here is that SOA, if used correctly, touches in different ways on all of these dimensions. In particular, most SOA applications have to deal heavily with the following dimensions: presentation, business process (workflow), business rules, and database (see Figure 3). What Figure 3 shows is an idealized environment (architecture) in which the business process is initiated from the presentation layer (i.e., a process instance is created) and then the business process layer controls (orchestrates) the workflow between various “activities” done by specific “actors” (organizations, roles, or systems). The business activities in these activities call (trigger) application components that, in turn, access and return data to the application components that then perform business activities and invoke business rules to operate on that data. Finally, the manipulated data is displayed by the presentation layer. The most important thing here is what is called the “separation of concerns.” The presentation layer knows only what it sends and receives. In turn, each activity within the workflow only has access to the small amount of process instance data (e.g., customer ID, order ID, and so on, for

©2007 CUTTER CONSORTIUM

Presentation (UI Services)

Business Process (Workflow Services)

App A

App B

App C

Applications (Base/BR Services)

Information (Data Services)

Figure 3 — The relationship between UI (presentation), business process, base/business rules, and data management.

a sales order process) that it needs to be concerned with. The activity knows what it receives, the data layer that it uses, and the output it sends back. The data layer knows only what data is requested or needs to be updated (what we refer to as data views). This environment provides a contextual framework in which the various component parts are largely isolated from the other dimensions as much as possible. Old hands at large-scale design and architecture may object that most of this is not new or relevant to SOA. While the first objection is largely true (this overall architecture is not entirely new), the second is not; this kind of architecture is, in fact, quite relevant to SOA. Indeed, nothing in SOA

obviates the need for an overall framework where the nearly independent dimensions are carefully kept cleanly apart. The reason why this is important to SOA is that it is easy for people — in their zeal to adopt a new paradigm — to forget where they are and what they’re trying to accomplish. In the end, we’ll use a variation of Figure 3 as a model for understanding how to develop various kinds of services within an SOA framework. The dimensions and framework shown here represent decades of hard work, insight, and trial and error. Each new generation of software architects and tool developers seems to be prone to forget why some of these dimensions are important, believing somehow VOL. 7, NO. 11


6

BUSINESS INTELLIGENCE ADVISORY SERVICE

that because we have new technologies we can skip over (or omit) one or more of these dimensions. In my own work and in reviewing problem causes in developing very large systems, I’ve found that not having a solid dimensional framework makes it harder (sometimes impossible) to come up with good, stable, longterm solutions. Eventually, those dimensions overlooked come back to bite developers and ultimately users. In the next section, we talk about services. While I know that readers have likely been exposed to the basic definitions of services and SOA dozens, even hundreds, of times, we cover these definitions here to show that some of the problems that people are having with SOA are caused, in part at least, from starting with the wrong (or perhaps wrongheaded) definitions. THE FUNDAMENTAL NATURE OF GOOD SERVICES Well-Defined, Self-Contained, and Independent

A service is a function that is well-defined, self-contained, and does not depend on the context or state of other services. — Douglas K. Barry8

Most of the world’s big fights are over homonyms. I say “object,” and you say “object,” but we don’t mean the same thing; I say “methodology,” and you say “methodology,” but we don’t VOL. 7, NO. 11

mean the same thing. We talk and talk, but we don’t communicate. If the discussion is important, arguments ensue. If we’re lucky, we figure out we’re talking about different things; if not, we keep quarrelling and maybe start a war. And as certain concepts become more widely used, there are increasingly heated discussions. Some of the homonym problems are honest mistakes introduced by simple misunderstandings, while other problems are more malicious and are frequently caused, in the world of technology at any rate, by an attempt to relabel an existing product or approach so as to cash in on a new fad (e.g., “new, improved with SOA!”). Currently, “services” and “serviceoriented architecture” are hotly contested words and phrases. Not surprisingly, there are a number of different meanings, so we’re going to try to come up with some definitions that, I hope, will help us have some perspective about how to create an enterprise SOA environment. Let’s start by talking about services. Services are an attempt to create a component-based, virtual, distributed software development environment. One of the goals, it seems to me, is the desire to build (construct) software products for the Web (remember “Web services” came before plain-old “services”) in a completely transparent way.

As defined on SearchSOA.com: Web services (sometimes called application services) are services (usually including some combination of programming and data, but possibly including human resources as well) that are made available from a business’s Web server for Web users or other Webconnected programs. Web services range from such major services as storage management and customer relationship management (CRM) down to much more limited services such as the furnishing of a stock quote and the checking of bids for an auction item. The accelerating creation and availability of these services is a major Web trend. Users can access some Web services through a peer-topeer arrangement rather than by going to a central server. Some services can communicate with other services, and this exchange of procedures and data is generally enabled by a class of software known as middleware. Services previously possible only with the older standardized service known as Electronic Data Interchange (EDI) increasingly are likely to become Web services. Besides the standardization and wide availability to users and businesses of the Internet itself, Web services are also increasingly enabled by the use of the Extensible Markup Language (XML) as a means of standardizing data formats and exchanging data. XML is the foundation for the Web Services Description Language (WSDL).9 www.cutter.com


7

EXECUTIVE REPORT Notice that “Web services” are defined here using the term “services” itself, that is, a case where a phrase is defined in terms of itself — a common but frequently irritating technique. Forget that for a moment and think of Web services in terms of functions, where a function is defined as a transform that maps a set into a set of inputs into outputs.10 A service in this sense then is something that “does something.” In process terms, a value-added activity (aka service) is one that adds value (i.e., does something useful). Programming services or functions, for example, often return some output value(s) based on certain input(s). A search function, for example, returns a list of answers to a query or question (see Figure 4). One of the common definitions of services is that they are: (1) identifiable, (2) have well-defined inputs and outputs, and (3) are composable (i.e., multiple services can be strung together to make a larger service). The first notion that services are identifiable means that they have unique, recognizable names. One of the touted features of SOA is that it will be based on a large number of uniquely identifiable services and there will be online directories of common services (e.g., “process order,” “ship goods”) that a software developer can search for, choose, and then connect (compose) via

©2007 CUTTER CONSORTIUM

well-defined (standard) inputs and outputs.

protocols (see Figure 5). While this framework represents philosophically the basis for a serviceoriented future, the idea of generalized SOA directory services seems to be less advanced than that of internal directories, used by SOA development teams within individual organizations. Perhaps the presence of generalized, Internet-wide SOA services will eventually be achieved, but the history of reuse is far more complex and difficult than most

Indeed, included in this idea is that these directories would contain identified services, but they would also refer to “service providers” that would host (run/execute) the service on behalf of the “service customer,” and all of the communication between the various parties (service provider, directory, and service consumer) would be carried out via standard message

locations {from, to}

Routing Service

route segments

Figure 4 — A basic service.

Service Directory Service Query Service Description

Query Responses

Service Request Service Consumer

Service Response

Service Provider

Figure 5 — An SOA directory framework.

VOL. 7, NO. 11


8

BUSINESS INTELLIGENCE ADVISORY SERVICE

proponents admit. We will return to this idea of reusability a bit later in this report. At any rate, SOA is currently presented as a utopia of common, clearly identified, easy to find, easy to put together, stateless services. The technology involved is pretty well in place, but using this technology is much more difficult than one might imagine, with most of the difficulties stemming from issues having to do with semantics and understanding. Suppose we extend our routing service in Figure 4. Figures 6a and 6b, for example, show a slightly more extended picture of how independent services can be combined to create a larger service. Figure 6a shows the routing service defined in terms of inputs and outputs in Figure 4, showing individual input and output data, while Figure 6b shows a “mapping service” that takes routing

“From: 104 Woodlawn, Topeka, KS” “To: KCI Airport”

segments and puts them out as a set of line segments overlaid onto a map. Figure 7 shows a hierarchical arrangement of these services nested within a “routing and mapping service.” This is how SOA is supposed to work, at least in theory. A complex function is broken down into a series of simple functions or, conversely, created by combining a series of simpler functions into a larger, more complex one. However, in this case, this power comes with considerable development cost. Mapping software systems such as MapQuest or Google Maps are the result of decades of R&D involved in coming up with digital mapping standards, digital (network) map databases, and software to transverse these digital maps to come up with routes and display the result. Hidden from view in this service view of the world (or that of a series of

programming APIs for that matter) is the underlying databases and metadata. To show this, I’ve chosen to use a diagramming standard called IDEF011 in Figure 8. In this representation, the inputs still come in from the left and the outputs still leave from the right, but we’ve added a resource (in this case, database) coming in from the bottom and control (metadata) coming in from the top. Notice that the IDEF0 diagram has some interesting similarities with the architectural diagram in Figure 3. Indeed, services, functions, and IDEF0 activities all share many of the same properties; they have inputs, outputs, databases, and metadata (control) information to function correctly. This is no accident; the nature of IDEF0 process analysis and that of SOA analysis and design turns out to be the same — ensure a careful, correct modular design.

Mapping Service

Routing Service

Figure 6a — Routing service with sample input and output.

Mapping Service

Figure 6b — Mapping service with sample input and output.

VOL. 7, NO. 11

www.cutter.com


9

EXECUTIVE REPORT Modularity, Coupling, Cohesion, Data Flow, and Functional Decomposition

Where the hot topics of software development today are “services,” “open,” and “agile,” in the 1970s and 1980s the hot topics were “structured,” “data structured,” “modular,” and “data flow.” Even back then, systems were becoming larger and more complex, and it was becoming clear that software designs had to enforce some more rational form of modular architecture.

locations {from, to}

Indeed, in order to build and test any very large system, it is necessary that the system be broken down into pieces that can be defined, implemented, and tested independently of one another, that is, modular. Therefore, all large systems were modular, and so being modular was not enough to guarantee success! What was important, then, was which kinds of modular systems were the easiest to develop, test, and modify — in other words, what was the best kind of modularity? The answer about what were the key characteristics came from some

Routing Service

route segments

groundbreaking work by Larry Constantine and Cutter Fellow Ed Yourdon12 based on the analysis of the maintenance history of large systems. What they found was that the two key attributes of good modules were “coupling” and “cohesion.” Those modules that were the cheapest to maintain were those that had low coupling and high cohesion. As it turned out, there came to be an appreciation about that time that modules with these attributes were easier to develop and test as well.

Mapping Service

route map

Routing and Mapping Service

Figure 7 — Adding routing and mapping services to produce a routing and mapping service.

metadata

locations {from, to}

Routing Service

route segments

Digital Map DB Figure 8 — IDEF0 definition of routing service. ©2007 CUTTER CONSORTIUM

VOL. 7, NO. 11


10 Low coupling means that modules are not dependent on the functioning of any other modules that aren’t submodules of the module itself. To repeat the quote at the beginning of this section, a good service does not depend on the context or state of other services (modules). This meant that pieces (modules) could be worked on independently, and teams could be assigned to work on these independent modules without constantly meeting with every other team. As a result, low coupling made both development and maintenance easier; programmers or teams could focus on just their tasks at hand. One way to ensure that modules were both independent and had low coupling was to focus on modules that had only a single function, that is, they had high cohesion. High cohesion means that a good module does only one (or a small number of) thing(s). Such modules do not try to bundle a large number of functions within a single module.13 Over time, high cohesion was shown to represent a form of functional design that focused on defining modules based on generating outputs from inputs — in other words, something called data flow design. Data flow design is a method of representing software designs by breaking down problems into sets of data flows by a process of functional decomposition (aka “topdown design”). Designers would develop a picture of a system in VOL. 7, NO. 11

BUSINESS INTELLIGENCE ADVISORY SERVICE terms of the high-level inputs and outputs from the system. Then the highest-level design would be broken down level by level with the definition of inputs from each higher level acting as a constraint on the next level of design. Ideally, this process would yield modules and submodules that were consistent and highly cohesive. In practice, the process was much more complex, but the concept of functional decomposition was easy to understand and teach and greatly advanced the movement toward a more functional view of software system design. Interestingly, the ideas of modularity, coupling, cohesion, data flow, and functional decomposition have come back into vogue as a result of the increased interest in process (workflow) modeling and SOA. Services resemble wellformed modules more than they resemble objects. In order to create good services that can be strung together to create more powerful services, there needs to be increasing thought given to inputs, outputs, database, and controls. But in addition to the idea of modularity, there are also a couple of other ideas that are extremely valuable in the search for an enterprise strategy for SOA: structured and data structured design, as we discuss next. Structured and Data Structured Design

In the 1970s, software researchers came up with some radical but simplifying ideas about the

organization of programs. Programs that restricted themselves to a form of highly organized structures based on a very few fundamental logical forms turned out to be easier to develop and debug than problems where the flow of control jumped willy-nilly from one piece to another. This discipline became known as “structured programming.” Software researchers in Italy had discovered that all programs could be constructed using only a handful of logical constructs, but it was software researchers like Edsger Dijkstra, Harlan Mills, Niklaus Wirth, and others that made “structured programming” a major movement. These basic structures included only sequence, alternation, and repetition initially, but subsequent work discovered that there were two other structures that were also required for most real-time and complex modeling and design: concurrency and recursion. Sequence, of course, meant that portions of a program could be organized one piece after another. This was the most fundamental kind of structure where Part A was executed before Part B, and Part B was executed before Part C (i.e., beginning, middle, and end). Alternation meant things could take only one leg or the other of an if-then-else conditional construction.14 Repetition meant that subpieces of a program could be repeated, returning to the start at a starting point, again based on some condition. Concurrency www.cutter.com


11

EXECUTIVE REPORT meant that two (or more) pieces could be initiated at the same time and processed in parallel, while recursion meant that programs could be broken down into a structure that actually invoked another instance program itself as a subroutine (module) of itself. Structured programs (and systems) with modules that were loosely coupled and highly cohesive turned out to be much easier to manage intellectually and as a result easier to debug and maintain. At a systems level, structured became synonymous with data flow design. As people began to constrain program structures to the basic set, multiple software researchers noticed that in a variety of circumstances the structure of the data and the structure of the program that operated looked very much alike. That identity spawned the idea of “data structured design.” Data flow design promoted thinking about modular design in terms of the flow (transforms) of data. Data flow design allows the outputs of activity A to be connected to become the input to activity B, and the output of activity B becomes the input to activity C. Data flow design also supported the idea of functional decomposition in which a complex function with well-defined inputs and outputs could be broken down into a set of connected, lower-level functions, the first of which had the same input as the higher-level function, and the final one having

©2007 CUTTER CONSORTIUM

the same output as the higherlevel function. The route and mapping function in Figure 7 could be seen as an example of functional decomposition. Data flow design, it turns out, is a very old form of design. It is the theme of one of the most important contributions that Unix made to software development in its earliest days — the concept of “pipelines” or “pipes.”15 A number of common functions were implemented using standard implementation of common text-processing functions (called “filters”), which could be connected in a variety of ways to produce different outputs, much as SOA gurus predict that it will be possible shortly using common services. So how does this discussion of coupling, cohesion, data flow, and data structure help us understand SOA and, more importantly, understand the role that the enterprise data layer plays in SOA? The answer is that, in a way, SOA design (at least the top-down design version of SOA design) can be thought of as a subset of data flow design. While SOA is often thought of as a natural extension of OO and component-oriented design, it can just as easily be viewed as the rediscovery of modular design since services and modules are so outwardly similar. Many of the things that were learned about data flow design are relevant once again. From this viewpoint, it is interesting to view many of the current discussions

about how to do SOA design that have already been discussed in a similar but different context. From my own practice, I have learned in designing complex systems that you need to come up with a simple architecture that isolates the various components in each independent (quasi) dimension — an architecture that is both explainable and understandable. SOA clearly involves all the dimensions described in Figure 3, but we’re going to simplify our problem by focusing on just three of those dimensions: process (workflow), services (application/business rules), and data (information). GET ON THE BUS(ES): THE PROCESS BUS, THE APPLICATION BUS, AND THE DATA BUS In computer architecture, a bus can be defined as follows: A bus (bidirectional universal switch) is a subsystem that transfers data or power between computer components inside a computer or between computers, and a bus typically is controlled by device driver software. Unlike a point-to-point connection, a bus can logically connect several peripherals over the same set of wires. Each bus defines its set of connectors to physically plug devices, cards or cables together.16

The idea of a software bus that is being talked about today is clearly inherited from computer

VOL. 7, NO. 11


12 (electronic) hardware. Historically, buses have provided flexible structures for various electronic components to communicate with one another. Such buses normally represent standards agreed upon by the major players in the industry. The key components and the attendant different bus structures are often referred to as “the hardware architecture.” Armed with standard definitions that describe how one connects with a given bus, manufacturers’ engineering teams can design components or peripherals relatively independent of the organizations designing the other components. For decades, software engineers have striven to emulate the hardware folks. Today, more than at any other time, buses are the rage in the software world. While my colleague Andy Maher says he dislikes the term “bus” when associated with software because it gives people the illusion that there are real (physical) as opposed to virtual connections involved, I, on the other hand, believe that buses, even if they are logical fictions, are very useful fictions. Software buses make it possible to lay out elegant frameworks from which it is possible to build logical components and connect them in standard ways, without having to worry much about the internal (physical) characteristics of the pieces. You can see from the earlier discussion about teasing apart the various quasi-independent software dimensions how useful software buses can be in this endeavor. We VOL. 7, NO. 11

BUSINESS INTELLIGENCE ADVISORY SERVICE are going to use this approach to propose a solution for the ideal architecture of the ultimate “software bus.”17 In order to think about a distributed, Web-based, Enterprise Service Bus (ESB), we are going to need at least three different sub-buses by the time we’re done. The three most prominent are: (1) a process bus, (2) an application/business rule (service) bus, and, of course, a (3) data bus (see Figure 9). The Enterprise Process (Workflow) Bus

Figure 9 describes a situation in which various activity services are executed by an overall business process (workflow) management scheme that allows for quasiindependent services to be hooked together via a workflow network. One can think of a workflow management system as a process bus that communicates the minimum information necessary between activities (outputs, inputs) and key process context information. For example, in a vehicle rental application the “rental agreement number” and “rental location” are likely to be passed as inputs to each individual service so that the basic process context can be inferred; information about decisions also need to be passed from activities to decision points (e.g., “credit = ‘approved’”). With the increasing importance of automated business process

management, there has been considerable headway in developing common business process languages for modeling (BPML) and execution (BPEL). One of the continuing headaches in process management is the likelihood of having more than one workflow process management engine involved. Increasingly, COTS packages have adopted a standard workflow engine to control the inner workings of their processes. Software vendor SAP, for example, which was one of the earliest adopters of a very strong process management structure, has its own Web-based process management scheme (NetWeaver), while IBM and Microsoft each have their own as well (i.e., WebSphere and BizTalk). Over time, standards like BPML and BPEL will increasingly make it possible for a smooth handover across major (and minor) vendors independent of their specific process bus implementation. A process bus, then, provides a standard harness, if you will, for connecting process activities together. Moreover, the process bus also provides the capability of maintaining what are called the “process instances” and, of course, of managing the activityto-activity flow. One of the enormously valuable secondary benefits of using process buses is that they collect a great deal of important status and performance data with little or no overt effort on the part of the people or systems that execute the individual www.cutter.com


13

EXECUTIVE REPORT

Process Bus

Enterprise Process (Workflow) Metadata

Application Bus

Enterprise Service/ Business Rule Metadata

ProcessServices Services Process

Application Services

Data View Services

Data View Navigation Services

Data Bus

Data Metadata

Core Enterprise Data Services

Data Adapter Services

Figure 9 — The components of the service bus.

activities. It is always possible to know where “such and such a reservation” is in the vehicle rental system by tracking the process instance of the one, for example, with “rental agreement number = 123456.” (Figure 10 shows a series of process instances that have started at separate times for a sales order process; the activities (boxes) that are darker represent the current active activity.) Two terms that are commonly associated with business processes are “orchestration” and “choreography,” as described by Carol McDonald: ©2007 CUTTER CONSORTIUM

Orchestration defines the control and data flow between Web services to achieve a business process. Orchestration defines an “executable process” or the rules for a business process flow defined in an XML document, which can be given to a business process engine to “orchestrate” the process, from the viewpoint of one participant. Choreography defines the sequence and dependencies of interactions between multiple roles to implement a business process composing multiple Web services. Choreography describes the sequence of interactions for Web service messages — it

defines the conditions under which a particular Web service operation can be invoked. WSDL describes the static interface, and choreography defines the “dynamic” behavior external interface from a global view. BPEL4WS primarily focuses on orchestration, while WSCI focuses on choreography. With WSCI each participant in the message exchange defines a WSCI interface. With BPEL4WS you describe an executable process from the perspective of one of the participants.18

What the definitions above convey is that orchestration describes VOL. 7, NO. 11


14

BUSINESS INTELLIGENCE ADVISORY SERVICE

1

Cust omer

Accept and Pay

shipm ent inv oice

Order Entry order

Order Entry

preapproved Y es

No

preapprov ed credit

Credit Manager

2

entered order

Cust omer

Check Credi t

approv ed order

Accept and Pay

shipm ent inv oice Sales Manager

Allocate Goods

Order Entry order

Order Entry

preapproved Y es

shipping notic e

preapprov ed credit

Shop No

Ship Goods Credit Manager Accounting

entered order

Check Credi t

approv ed order billing notice

3

Bill Custom er

pay ment

Process Payment

Cust omer Sales Manager

Allocate Goods

shipm ent inv oice

shipping notic e

Acce pt and Pay

Order Entry Shop

Accounting

Order Entry

preapproved Yes

preapprov ed credit

Ship Goods

ACME Manufacturing -- Sales Order Process

No

order

billing notice

Credit Manager entered order

Check Credi t

Bill Custom er

pay m ent

Process Payment

approv ed order

Sales Manager

Allocate Goods

Shop

shipping notic e

ACME Manufacturing -- Sales Order Process Ship Goods

Accounting billing notice

Bill Custom er

pay m ent

Process Payment

ACME Manufacturing -- Sales Order Process

Figure 10 — Business process instance.

the high-level “logical” business process, whereas choreography describes the detail physical, dynamic messaging that needs to be sent, received, and monitored to ensure that the physical implementation mirrors what was actually intended in the logical design. History has shown that software tool vendors are ultimately able to hide the physical details of implementation from developers. The fact that there are still discussions about choreography of business processes indicates that there are still issues with implementation of workflow management over the Web. Designers should be extremely careful not to let clever programming dictate the longrange design of key processes.

VOL. 7, NO. 11

One of the key issues of enterpriselevel workflow management is the coordination of parallel, asynchronous activities across departmental, even enterprise, levels. Programming such activities has always been a tricky business. This is the domain of operating systems and real-time control systems. However, with the worlds of research, development, and experience in such systems, coming from these domains, developers will be less and less concerned with the physical details.19 Defining the business process is a collaboration between business users and business process analysts. Business process analysis and design provide the basis for “swimlane” models that, in turn,

provide the basis for the overall workflow network and the identification of the various activities within the model. While activities can be completely or largely manual, most business process activities in swimlane models today involve either individuals acting in specific roles using some computer application (e.g., “checking for vehicle rates,” “confirming a reservation”) or some completely automated activity (e.g., “calculate final bill”). The Enterprise Application/ Business Rule Bus (EAB)

The EAB is usually described as a way for services to communicate with one another over a distributed environment using standard messaging protocols. Within the www.cutter.com


15

EXECUTIVE REPORT context of our earlier discussion regarding the separation of dimensions, we are going to focus the domain of the EAB on just the business application and business rule portion of application development and what I call the data view services, though the EAB could be, and often is, seen as the bus to end all buses. Whereas the process bus is concerned about process orchestration, the EAB is heavily involved in the registering, locating, and accessing of data, and with providing services with the right data. The advantage of doing these functions over the Internet through common directories is that, as in the old mainframe environment, software changes don’t have to be downloaded to millions (billions) of client systems in order for users to have the latest updates. Changes can be made once and be available immediately to everyone on the Internet. But as always, the devil is in the details. If you take the millions of individual pieces of millions of applications written in dozens of languages (both natural and computer), then the possibilities for misunderstanding are naturally enormous. Therefore, the EAB within our design structure is broken into two functions: application services and data view services. Application services are the work to be done, while data view services are the data interface to the internal data (enterprise data) that is needed to execute the application services. In my view, ©2007 CUTTER CONSORTIUM

Figure 11 — Data view definition, vehicle availability query.

this should be the minimum data necessary to produce the right input from the input provided. The data view services should also be responsible for initiating the updating of the enterprise data as well. The definition and design of application services and the data view services should be the responsibility of the application service designers working with the business users and the business process analysts. There are a variety of ways of specifying such data views, but I recommend the form shown in Figure 11.20 A developer defining a data view service to support a business service doesn’t need to know the underlying source or structure of the

enterprise data but should concentrate on the data that is needed to make the service work and describe that data in terms that are best understood to the user. The Enterprise Data21 Bus

The third component of the Enterprise Service Bus, the enterprise data bus (EDB), parallels the enterprise application bus and provides the interface (linkage) between the individual data view services and the actual enterprise data needed to retrieve and update that information. In a very real sense, the data bus is an architectural representation of an enterprise-level, distributed, heterogeneous data management system. The idea here is for the EDB to operate at the enterprise VOL. 7, NO. 11


16 level in much the same way that a DBMS (e.g., Oracle, DB2, SQL Server, MySQL) manages data for a set of applications within the enterprise. It is hoped that market pressures will provide the impetus for the major vendors to begin to build EDBs that will interface not only with their own DBMSs but with all matter of legacy and external data as well. The EDB should be responsible for managing information for enterprise-level services to data across the enterprise, stored wherever it happens to be and in whatever format it happens to be in. The EDB is a way of hiding the physical location and physical structure of existing data, which currently resides in some legacy data store, internal data store, or external data source. The EDB allows developers and/or software packages in the enterprise application bus to ask for information structured precisely for their needs and lets the underlying EDB functions access that information and structure it appropriately. The EDB should be a twoway street, that is, it should: (1) provide access to pull data from the ESB; and also (2) update data supported by the EDB. If you look at how many enterprise data services actually work, they do a pretty good job today of accessing data but not nearly so good a job of supporting the update function. The middle layer of the EDB contains a logical (and/or physical)

VOL. 7, NO. 11

BUSINESS INTELLIGENCE ADVISORY SERVICE mapping of legacy data to common business semantic entities. What is referred to here as the core enterprise data is not unlike the operational data stores that occur in data warehousing implementations. In many SOA implementations today, this is supported almost entirely via metadata and temporary copies of data. BEA, which has a very good implementation that it refers to as “data virtualization,” allows for the implementation of the top two functions of the EDB. This is effective as long as the SOA is involved largely in business intelligence functions (today, estimates are that 90% of SOA functions are BI-related). However, as SOA is extended to other more transactional functions, there will be an increasing need for the middle layer of the EDB to be increasingly real. The lowest level of the EDB contains the “data adapters” that expose (attach) legacy data to the EDB. In enterprise architecture consulting work, my colleagues and I are constantly involved in trying to help clients figure out how to move from very old, legacy applications whose data still resides in obsolete database platforms. For the last two decades, organizations have attempted to replace these core legacy systems with some form of COTS application, with varying degrees of success. Not all of these attempts have been successful and even the successful replacement strategies have been traumatic.

Increasingly, organizations are taking a longer view and thinking about using services to replace legacy functions with Web-based, service-oriented applications while using a data strategy to slowly migrate the legacy data to more modern platforms. WHAT LIES BEYOND: ENTERPRISE DATA SERVICES In my discussions about and research into SOA, I have come to believe that a number of really important trends are, with SOA, coming together. I believe that the real future of SOA is to become the architectural framework, not just for new classes of “composite applications,” but for a broad class of enterprise applications. This is a trend that has been underway for some time. Over the past few decades, organizations have moved from a set of quasiindependent, “siloed” applications to more enterprise-wide, business process–driven applications and from application data management to enterprise data management. This architectural shift has occurred along a number of stages. By the beginning of the 1990s, for example, relational database researchers had developed enterprise-wide approaches to distributed database management. DBMSs with the capability to support two-phase commit and replication were increasingly available. At about the same time, the computing world shifted to client-server, distributed applications that would separate www.cutter.com


17

EXECUTIVE REPORT presentation components from application processing and database manipulation. And at the same time that organizations were beginning to unravel the problems of client-server and distributed computing, which were, by nature, multiplatform, enterprise-wide problems, the business intelligence community was hard at work on the “enterprise data warehousing problem” for integrating BI data across the entire enterprise. Over this period, data warehousing became a major component of a total enterprise data architecture driven by a growing need to: (1) integrate data from multiple operational systems for management

and marketing purposes and (2) to make it easier and faster to access that integrated data for BI purposes (see Figure 12).22 This process was greatly accelerated by the explosion of the Internet into corporate computing. Distributed computing, business process management, and data warehousing have all matured greatly since they were first proposed in the 1980s. In most cases, this has been a productive exercise. Organizations that have embarked on and built robust data warehousing architectures, for example, have been able to greatly enhance the ability of managers, analysts, professionals, and operational personnel to access critical enterprise data.

The introduction of the Internet has only amplified the kinds and types of data that are now available, and Web-based portals now make this enhanced data available to larger and larger numbers of users (e.g., customers, business partners, media). Over the next decade, organizations, whether they use systems run on their own computers developed by themselves or software vendors or they use hosted systems operated in a softwareas-a-service (SaaS) framework on cloud computing environments, will have to rethinking their data strategies. No matter whose programs an enterprise uses, the data is its asset, perhaps its most important asset. As more and

Figure 12 — Enterprise data flow architecture for data warehousing.



As more and more organizations realize just how valuable their data assets are, they are going to have to manage that data, not in a piecemeal fashion but from an enterprise-wide standpoint. Enterprise data management is just that: managing all of the enterprise’s data. Because of the complexity of large organizations’ data, whatever scheme enterprises come up with will necessarily be a distributed solution. Developing an enterprise data management (i.e., EDB) solution within SOA represents a way to recapture the high ground in data management. If SOA is an enterprise function, then enterprise data management is as well. The EDB makes it possible to migrate underlying applications and databases without anyone knowing, which is really important, since in the next decade there will be increasing attention on rebuilding many enterprises’ core business applications as the people who know those systems move on and the computing industry moves to new platforms.

MISREADING THE HISTORY OF ENGINEERING: REUSABILITY

The ultimate objective of SOA, or any other software approach for that matter, is not reusability; the ultimate objective is better, cheaper, easier-to-maintain complex systems that support better business processes and a much higher degree of business agility. Reusability may be one of the ways to accomplish these goals, but it is not the end in itself.

Reusability ought to be the happy accident of good design and the use of component parts where those parts are available. In manufacturing, reusability is largely the result of the high cost of manufacturing the individual component parts. In general, the more parts of any given kind you produce, the lower the unit cost; there is therefore a natural economic drive to use standardized parts wherever possible. But in software, the cost of manufacturing software parts, after the first one is developed, is almost zero. Software costs all come from development and testing, often development of “almost” the same thing over and over again. There is no manufacturing line, just an engineering department. There are lessons to be learned from the last 50-plus years of software development. Those software components that have been most reused have been those that are well defined and well named. I suspect that the most successful program of reuse in software history was the Fortran scientific subroutine library. It was so successful because it was built on hundreds (actually thousands) of years of mathematical practice, which provided a common set of functions that were precisely named and precisely defined. Mathematicians, engineers, and scientists throughout the world use the same symbols to mean the same things and have been taught to use the work of others.

In addition, their work is functional, so one can use successive sets of operations to create new operators. Moreover, mathematical functions are complex and require exhaustive testing. All of these things lead scientists and engineers to seek out standard software rather than try to build a better mousetrap. Commercial software has been far more successful in defining common data than common functions or components. Today, we see the successful “mashup” of functions on the Internet, but when analyzed, it turns out that underlying these mashups are very sophisticated, very complex data structures that have been worked out over very long periods of time. Any number of SOA programs that I have seen spend enormous effort trying to decide which “common services” to build, in the hope that those services will dramatically speed the development of complex applications. History has not been kind to this approach; it has been tried without great success since the mid-1980s, first with objects and then with components as the rallying cry. History has shown that the most used functions are either very simple ones that are used thousands (millions) of times per day or very large ones that do something very complex that is difficult to define and debug.



This report was prompted by the observation that the thing people seem to find most useful from their SOA programs is the common data services (e.g., “get customer info,” “get product status”). Services, as has been pointed out here, are a form of modular design, and the history of modular design largely follows the lesson borrowed from building architecture: “form ever follows function.” If we do a good job of modular design, and do a good job of keeping the process, the activities, and the data straight, the reusable pieces will emerge naturally.

POTHOLES ON THE ROAD TO NIRVANA

For those organizations that are just starting down the road to SOA, this section addresses some of the things you need to be aware of in the process.

Coarse-Grained and Fine-Grained Services

You may have noticed that SOA experts suggest that you focus on “coarse-grained” as opposed to “fine-grained” services. You may also occasionally see a suggestion that you try to stay away from applications with very high transaction volumes. All of this is about SOA’s current level of performance. Historically, there has been significant overhead involved in making the magic of SOA happen. Therefore, things that get executed a lot, that is, fine-grained services, are to be avoided.

The piece of the SOA iceberg that lies under the water often involves accessing data from non-SOA legacy systems and making it available to Web service-enabled applications. How can you make sure that you’re not developing fine-grained services that will overwhelm your performance? Well, from a design standpoint, start with top-down design. As you get further down the process chain, have developers and database folks estimate the number of accesses that critical services may receive during peak periods of use. Get advice from SOA experts on how to code those critical, fine-grained services. From a data access standpoint, it means getting relatively large chunks of data when you do data access. Don’t access individual attributes if you don’t have to.
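The following sketch contrasts the two styles; the service and attribute names are hypothetical and the bodies are stubs, but the shape of the interfaces is the point:

# Fine-grained: one remote call per attribute; chatty, and the SOA
# overhead is paid on every call.
class FineGrainedCustomerService:
    def get_customer_name(self, customer_id: str) -> str: ...
    def get_customer_address(self, customer_id: str) -> str: ...
    def get_customer_credit_limit(self, customer_id: str) -> float: ...

# Coarse-grained: one remote call returns a relatively large chunk of data;
# the caller picks out the attributes it needs locally.
class CoarseGrainedCustomerService:
    def get_customer_info(self, customer_id: str) -> dict:
        return {
            "customer_id": customer_id,
            "name": "Acme Corp",
            "address": "1 Main St",
            "credit_limit": 50000.0,
            "open_orders": [],
        }

info = CoarseGrainedCustomerService().get_customer_info("C-1001")  # one round trip
name, limit = info["name"], info["credit_limit"]                   # local access, no extra calls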

The Cost of XML

SOA performance problems are exacerbated by the significant data redundancy involved in XML messaging. While XML advocates expound on the fact that XML messages are both computer and human readable, neither computers nor people much care about this feature. Computers don’t actually read the text part of XML documents except to scan for the “real data,” and people read them only a relatively few times, usually early in development or when something goes wrong. On the other hand, XML messages are handled millions of times per day by high-volume applications. What all this means in execution is an enormous amount of redundancy in messages (often five to 20 times the size of traditional flat encoded files), which requires bigger, faster pipes. In addition, because of their hierarchical nature, XML messages are also harder to validate, which adds further processing load. Since so much of SOA is based on XML, what should you do if you have lots of transactions? Well, the answer is to look at which interfaces are the most significant and to talk to experts about ways to remediate the problem. As the number of messages per day or hour or minute mounts, large organizations are coming up with ways to reduce communication load and processing time. In some cases, this means encoding and transmitting the XML tags only at the beginning of a session, then using that information to compress messages on the sending side and expand them on the receiving side. This adds parsing time at both ends, but specialized communications devices will help in this area in the future by building the compression algorithms into microcode.
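A minimal sketch of the shared-dictionary idea follows, using Python’s zlib; the tag vocabulary and the message are invented, and a production implementation would negotiate the dictionary as part of session setup:

import zlib

# Tag vocabulary agreed on once at session start; both ends hold the same copy.
SESSION_TAGS = (b"<customer><customerId></customerId><name></name>"
                b"<creditLimit></creditLimit></customer>")

def compress_message(xml_text: str) -> bytes:
    # The shared dictionary lets the compressor replace repeated tag text
    # with short back-references instead of resending it in every message.
    c = zlib.compressobj(level=9, zdict=SESSION_TAGS)
    return c.compress(xml_text.encode("utf-8")) + c.flush()

def expand_message(payload: bytes) -> str:
    d = zlib.decompressobj(zdict=SESSION_TAGS)
    return (d.decompress(payload) + d.flush()).decode("utf-8")

msg = ("<customer><customerId>C-1001</customerId><name>Acme Corp</name>"
       "<creditLimit>50000</creditLimit></customer>")
wire = compress_message(msg)
assert expand_message(wire) == msg  # round trip; the wire form omits the repeated tag text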

The Complexity of SOA

Another performance problem in the world of Web services is the number of physical layers involved in moving data up and down the physical network that underlies our Web-based applications, and the resulting complexity.


In researching this Executive Report, I found a very interesting report from IBM on visualizing the execution of Web Services,23 which describes a tool that IBM has developed to help its SOA developers track down performance problems. Just looking at some of the graphs in that report shows the practical problems that SOA developers are likely to have developing programs that involve large numbers of applications and services running on lots of different computers with different loads, all connected by many different telecommunications pipes running at different speeds. So what can you do if your performance is slow and you can’t figure out the reason? Well, looking at tools like IBM’s Web service visualization tool is one thing you can do. Other vendors are sure to have similar tools; find out what they are and get snapshots of how the services are going to behave on real applications.
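Even without such a tool, lightweight instrumentation of your own service calls can narrow things down. The sketch below is a generic illustration (the decorator and service names are invented, and the sleep stands in for a real remote call), not a description of IBM’s tool:

import functools
import logging
import time

logging.basicConfig(level=logging.INFO, format="%(message)s")

def traced(service_name: str):
    """Log the elapsed time of every call to a (remote) service."""
    def wrap(fn):
        @functools.wraps(fn)
        def inner(*args, **kwargs):
            start = time.perf_counter()
            try:
                return fn(*args, **kwargs)
            finally:
                elapsed_ms = (time.perf_counter() - start) * 1000.0
                logging.info("%s took %.1f ms", service_name, elapsed_ms)
        return inner
    return wrap

@traced("get_customer_info")
def get_customer_info(customer_id: str) -> dict:
    time.sleep(0.05)  # stand-in for the network hop and remote processing
    return {"customer_id": customer_id}

get_customer_info("C-1001")  # logs something like: get_customer_info took 50.3 ms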

The Overriding Importance of Data

I strongly believe that what I call the enterprise data bus is critical to getting the maximum value out of SOA over both the short and the long run. Think of the EDB as your enterprise database manager. Make it as cleanly architected as you can, and constantly go back and refactor the overall business semantic model whenever new data structures are needed or new attributes are added to existing data views.


So what can you do to develop an enterprise data service mentality? Well, one thing is to look at tools that can bridge the gap. One tool that I know of, DeKlarit, works with Microsoft Visual Studio; it takes sets of structures like the one in Figure 11, builds a common data model, and automatically writes the navigation code that extracts the data and presents it to the programmer in the way I described earlier.

THE GHOSTS OF CHRISTMAS PAST AND PRESENT: AD/CYCLE, THE IBM SAN FRANCISCO PROJECT, AND MICROSOFT’S PROJECT OSLO

When everybody knows that something is so, it means that nobody knows nothin’.

— Andy Grove, cofounder of Intel, in a 2005 interview with Fortune

Like so many things, SOA is more of a journey than a destination. In the late 1980s and early 1990s, I worked with IBM on a very ambitious project called AD/Cycle. AD/Cycle was IBM’s answer to automating the software process, what was called CASE at the time. IBM spent hundreds of millions on the project before it more or less slipped from the radar in the mid-1990s.

Not long afterward, IBM started another project called the San Francisco Project. It was also a major undertaking, this time to build a framework for constructing business systems in Java. It too disappeared a few years later, after absorbing more than $100 million. Neither AD/Cycle nor the San Francisco Project was a total loss, for they produced tools that have found their way into the current Eclipse offerings; however, neither produced the results expected. It turns out that coming up with an end-to-end solution (business strategy → working code, and back) is an enormous undertaking, too much for even the largest, most sophisticated organizations in the world. Microsoft has also been working on its version of the end-to-end structure, called Project Oslo. As with AD/Cycle and the San Francisco Project, there is a great deal of emphasis on rapid, Web-based, model-driven development in the discussions of Project Oslo, but if experience is any guide, Microsoft will have more problems delivering than one is led to believe. This is especially true when one considers the sheer complexity of the existing IT systems and infrastructure. It is reassuring that everyone seems to be headed in the same direction. However, AD/Cycle, the San Francisco Project, and now Project Oslo all promised great things.




All of these attempts had high goals, had considerable buy-in from other software providers and customers, and were widely touted in the technical press; and they all consumed (or will consume) hundreds of millions of dollars. Unfortunately, they didn’t deliver the revolution everybody had been promised. They did, however, create a much more complex and sophisticated operating environment. The world clearly needs the Kool-Aid these folks are selling, but getting it and enjoying the benefits is not going to be easy; the organizations that succeed will probably pick off pieces and get to the end point faster than those that take on a massive change all at one time. In a way, selling SOA is a lot like selling industrial-grade trash compactors. While important, trash compactors aren’t sexy; the same is generally true of software architectures. New, glitzy systems are sexy, and new systems that bring in lots of new customers are sexy, but software architectures are not. Moving to a new systems architecture takes time and education. Moreover, there is the learning curve: history has shown that the first use of a new technology typically takes twice as long as the second. SOA is becoming more mature in those firms that have been using it for some time, but it is at ground zero in many organizations. The good news is that those organizations that do invest in software architectures and stick with the program usually reap the rewards.


The organizations that invested in data warehousing in the 1990s, for example, are now able to reap real rewards from those investments. The companies that are able to take advantage of master data management are largely those that made those important investments in data warehousing. My guess is that the organizations that take a long-range view toward SOA will be the ones that benefit the most from it. I know of companies that have been engaged on the path toward distributed, flexible, model-driven applications for more than a decade now, going back to work on CORBA and DCOM and any number of other initiatives. At base, however, large-scale applications still have the same components that we talked about at the beginning of this report. Over time, as we get smarter and smarter, we will be able to tease the parts apart and eat our really big elephant one bite at a time. Before we conclude, I want to say a word about SOA and governance. One of the problems that I’ve seen with some of my clients is a conflict between the SOA developers (mostly Java/XML or .NET folks) and DBAs. The most serious issue I’ve seen had to do with who had the right to update what data. Now, I often end up on the wrong side of debates with DBAs, but with respect to updating, I believe that a representative of the owners of the data should be responsible for defining the updating rules for the source data.

Now, in a great many cases, this hasn’t turned out to be a problem, since a huge percentage of SOA deployments are BI applications. Finally, there is a strong danger from an enterprise architecture standpoint that SOA will turn into a “systems veneer,” not unlike the “screen scraper” technology that was popular in the late 1980s and early 1990s. The idea of composite applications makes it seem okay to build advanced systems on top of very old, very complex legacy applications. In many cases, it seems to me, this is somewhat akin to building a very big deodorizing plant over an open sewer: you can make things seem better, but you haven’t gotten at the root problem. Old, complex legacy systems cannot simply be hidden from view; they are so flawed, and interfaced with so many other systems, that everyone is afraid of touching them. Using SOA merely as a means of providing slicker Internet interfaces to integrated data achieves only a fraction of the work that needs to be done. On the other hand, people are using SOA as a way to hide the transition from old platforms and databases to new ones. I have talked to a number of individuals working on such problems. The data adapters are built to “expose” key data while hiding the structure of the underlying source; then the database is reengineered piece by piece, the data is converted, and the data adapters are modified to correspond to the new database design.
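In rough code terms, the pattern looks like the sketch below (invented names, deliberately simplified): callers depend only on the data view, so the adapter behind it can be replaced as each piece of the database is reengineered.

# The data view's contract stays fixed; the adapter behind it is swappable.

def legacy_customer_adapter(customer_id: str) -> dict:
    # Reads from the old platform (stubbed here).
    return {"CUST_NO": customer_id, "CUST_NM": "ACME CORP"}

def new_customer_adapter(customer_id: str) -> dict:
    # Reads from the reengineered database (stubbed here).
    return {"customer_id": customer_id, "name": "Acme Corp"}

class CustomerDataView:
    """Stable 'get customer info' data service used by all consumers."""
    def __init__(self, adapter):
        self.adapter = adapter

    def get_customer_info(self, customer_id: str) -> dict:
        raw = self.adapter(customer_id)
        # Normalize either source into the same business-semantic shape.
        return {
            "customer_id": raw.get("customer_id", raw.get("CUST_NO")),
            "name": (raw.get("name") or raw.get("CUST_NM", "")).title(),
        }

view = CustomerDataView(legacy_customer_adapter)   # before the migration
view = CustomerDataView(new_customer_adapter)      # after it; consumers are unchanged
print(view.get_customer_info("C-1001"))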



SOA represents a new way of thinking about systems. There is a great deal to learn. Only by having a really solid architectural framework and a plan for solving our enterprises’ long-term as well as short-term problems will SOA achieve what really needs to be achieved.

ENDNOTES

1. FiereWorks. “SOA Definitions” (www.fiereworks.nl/soa-definitions.html).
2. From the “IT Lecture Notes by Mark Kelly: Official IT Glossary” (www.mckinnonsc.vic.edu.au/vceit).
3. Rogers, Everett M. Diffusion of Innovations. Free Press, 1995.
4. Christensen, Clayton M. The Innovator’s Dilemma. Collins, 2006.
5. Moore, Geoffrey A. Crossing the Chasm. Collins, 2006.
6. Orr, Ken. “Managing Technology Decisionmaking.” Cutter Consortium Business-IT Strategies Executive Report, Vol. 5, No. 9, September 2002.
7. Foote, Brian, and Joseph Yoder. “Big Ball of Mud,” 1999 (www.laputan.org/mud).
8. Barry, Douglas K. Web Services and Service-Oriented Architectures. Morgan Kaufmann, 2003.
9. SearchSOA.com glossary (http://searchsoa.techtarget.com/gDefinition/0,294236,sid26_gci750567,00.html).
10. A mathematical function is actually defined as a transform that maps one or more inputs into a single output.
11. IDEF0 is an international standard that was originally called SADT and was developed by Doug Ross and a number of collaborators at a company called SofTech in the 1970s. It has proved to be a very powerful way to describe complex functional process diagrams.
12. Yourdon, Edward, and Larry L. Constantine. Structured Design. Prentice Hall, 1979.
13. Notice that this is a rather different tack than that of object-oriented (OO) design, where a large number of behaviors (methods) are bundled with individual objects.
14. A generalization of the if-then-else construction was the CASE construction, a multiway branch.
15. From the “Pipeline (Unix)” entry on Wikipedia: “In Unix-like computer operating systems, a pipeline is … a set of processes chained by their standard streams, so that the output of each process … feeds directly as input … of the next one. Each connection is implemented by an anonymous pipe. Filter programs are often used in this configuration. The concept was invented by Douglas McIlroy for Unix shells and it was named by analogy to a physical pipeline” (http://en.wikipedia.org/wiki/Pipeline_(Unix)).
16. See www.answers.com/topic/bus-computing.
17. A note of caution: since software buses are by nature virtual, it is important to examine proposed ones for both logical consistency and power. It is important not to confuse international standards with elegant or even workable bus structures. In the complex world of Web-based software development, there are competing standards, and just because one group has a solution that some of the major players agree on doesn’t necessarily mean that that standard will win out in the marketplace or even work for every class of problem.
18. McDonald, Carol. “Orchestration, Choreography, Collaboration and Java Technology-based Business Integration.” Carol McDonald’s Blog, 30 October 2003 (http://weblogs.java.net/blog/caroljmcdonald/archive/2003/10/orchestration_c.html).
19. It is not only possible, it is critical that all but the most vital physical issues be hidden from analysts, designers, and developers. History has shown that every time we let people know too much about the inner workings of some software tool or its environment, they invariably take advantage of the peculiarities of those inner workings, so that when we decide to change them, it is often a wrenching experience.
20. The advantage of this form of data definition over XML is that there are tools already available that will take a large number of definitions like the one in Figure 11, automatically define a normalized database from them, and automatically build the data navigation to make the data view services operate smoothly. Moreover, these tools will redesign the database and/or build revised navigation when any of the data views change.
21. Data in the context of the enterprise data bus includes unstructured data (e.g., e-mail, documents, images, attachments, multimedia) as well as traditional structured data.
22. A secondary, but important, objective was to spread the computational load across multiple platforms. In addition, instead of expecting operational systems to support increasingly large numbers of reporting and BI demands, redundant “informational” databases (data warehouses/data marts) were created on separate platforms, freeing operational systems to focus on handling transactional data.
23. De Pauw, W., et al. “Web Services Navigator: Visualizing the Execution of Web Services.” IBM Systems Journal, Vol. 44, No. 4, 2005.


ABOUT THE AUTHOR

Ken Orr is a Fellow of the Cutter Business Technology Council and a Senior Consultant with Cutter Consortium’s Agile Product & Project Management, Business Intelligence, Business-IT Strategies, and Enterprise Architecture practices. He is also a regular speaker at Cutter Summits and symposia. Mr. Orr is a Principal Researcher with the Ken Orr Institute, a business technology research organization. Previously, he was an Affiliate Professor and Director of the Center for the Innovative Application of Technology with the School of Technology and Information Management at Washington University. He is an internationally recognized expert on technology transfer, software engineering, information architecture, and data warehousing. Mr. Orr has more than 30 years’ experience in analysis, design, project management, technology planning, and management consulting. He is the author of Structured Systems Development, Structured Requirements Definition, and The One Minute Methodology. He can be reached at korr@cutter.com.



