Business Semantics by The Ken Orr Institute

Business Intelligence Vol. 5, No. 7

Business Semantics by Ken Orr, Fellow, Cutter Business Technology Council

For the most part, today’s information systems are not much more intelligent than they were 20 or even 30 years ago. Many leading systems thinkers believe that the next big breakthrough will come from getting software to recognize concepts at a higher level: to deal with the meaning of things and not just their form. Classically, the field of semantics deals with the study of meaning. This Executive Report discusses business semantics categories and their importance in shaping next-generation information systems.

Cutter Business Technology Council: Rob Austin, Tom DeMarco, Christine Davis, Lynne Ellyn, Jim Highsmith, Tim Lister, Ken Orr, Lou Mazzucchelli, Ed Yourdon

Access to the Experts

About Cutter Consortium Cutter Consortium is a truly unique IT advisory firm, comprising a group of more than 100 internationally recognized experts who have come together to offer content, consulting, and training to our clients. These experts are committed to delivering top-level, critical, and objective advice. They have done, and are doing, groundbreaking work in organizations worldwide, helping companies deal with issues in the core areas of software development and agile project management, enterprise architecture, business technology trends and strategies, enterprise risk management, metrics, and sourcing. Cutter offers a different value proposition than other IT research firms: We give you Access to the Experts. You get practitioners’ points of view, derived from hands-on experience with the same critical issues you are facing, not the perspective of a desk-bound analyst who can only make predictions and observations on what’s happening in the marketplace. With Cutter Consortium, you get the best practices and lessons learned from the world’s leading experts; experts who are implementing these techniques at companies like yours right now. Cutter’s clients are able to tap into its expertise in a variety of formats including content via online advisory services and journals, mentoring, workshops, training, and consulting. And by customizing our information products and training/ consulting services, you get the solutions you need, while staying within your budget. Cutter Consortium’s philosophy is that there is no single right solution for all enterprises, or all departments within one enterprise, or even all projects within a department. Cutter believes that the complexity of the business technology issues confronting corporations today demands multiple detailed perspectives from which a company can view its opportunities and risks in order to make the right strategic and tactical decisions. 
The simplistic pronouncements other analyst firms make do not take into account the unique situation of each organization. This is another reason to present the several sides to each issue: to enable clients to determine the course of action that best fits their unique situation. For more information, contact Cutter Consortium at +1 781 648 8700 or sales@cutter.com.

Business Semantics BUSINESS INTELLIGENCE ADVISORY SERVICE Executive Report, Vol. 5, No. 7

by Ken Orr, Fellow, Cutter Business Technology Council

Semantics is the branch of linguistics that studies meaning in language. One can distinguish between the study of the meanings of words (lexical semantics) and the study of how the meaning of larger constituents comes about (structural semantics). Semantic role (also called deep case, semantic relation, or thematic role) describes the role that a constituent plays with respect to the verb in the sentence. The subject of an active sentence is often the agent or experiencer. Other roles can be instrumental, benefactive, or patient-based, as in the following:

Peter (experiencer) died.
The cat (agent) chased the dog (patient).

Sometimes it is hard to break through the complexity of language to get at reality. Even the most fundamental sciences are often portrayed in mathematical complexities that mask their true objective. The study of intelligence, for example, has become a jumble of conflicting directions, down to the emergence of artificial intelligence, which as far as we can tell has almost nothing to do with real intelligence and a great deal to do with mathematics or symbolic logic. So too with the idea of semantics. Semantics is, or ought to be, the study of meaning in language. Recently, however, semantics has become yet another field of abstraction delimited by mathematical expressions and conjecture. But real semantics should not just be a mathematical representation; rather, it is about classes of words and phrases (categories) that signify something: something in the real world that relates to other things in the real world.

This Executive Report focuses on business semantics, which is the study of meaning in businesses and, particularly, the study of meaning in business systems. This report is the result of more than two decades of research and practical application. Business semantics outlines the interactions that lie at the base of most, if not all, information systems. If understood in the right context, these models make it possible for business analysts, systems architects, requirements engineers, and systems designers both to understand and to model the real world with a much higher degree of fidelity. Moreover, these models bring new insight into why certain organizational, data, and processing strategies reappear in every generation of systems, whether manual or automated.

When I proposed this topic as the subject of a Cutter Consortium Executive Report, the editors were concerned that the information might overlap with recent, similarly titled work. I then reviewed several recent articles with the phrase “business semantics” in the title. Fortunately (or unfortunately) I found little overlap. The subject of this report is, I think, closer to the original meaning of the word “semantics”: that is, the study of meaning. This report discusses the meaning of things as applied specifically to the business systems environment.

BUSINESS SEMANTICS AND THE REAL WORLD

Although business semantics is not an oft-heard phrase, it is no less important because of its lack of use. It would be ideal if the terms of expression that we include under the heading of business semantics were all carefully expounded from the outset, but that’s not the case. The terms and phrases used in this report are the direct result of doing tasks and then stepping back and trying to understand (1) what we have done, (2) where and why we have been successful, and (3) the kinds of patterns present among the most successful examples.

It’s important to use definitions carefully because their meaning can be misconstrued. From all our experience, we’ve learned that words matter. The right definitions can make all the difference. We’re convinced, for example, that the success of enterprise architecture is in large part derived from having the right terms, diagrams, and sequence of steps. If you want to be successful, all these pieces have to fit together. The more you use the right definitions, the easier it is to see semantics come to life.

All business systems are about the real world. That’s not to say, of course, that technology doesn’t play a major role. It is just that, in the end, business systems are about the business and the business environment. We talk here about actors, messages, subjects/objects, and events because they are most evident and significant when dealing with large-scale systems. Moreover, they are what must be modeled correctly if you are to come up with the right systems architecture. The business semantics categories and concepts discussed here, then, are taken not only from IT literature but from business, economics, and systems feedback research as well.

The Beginning of Semantics

Plato’s student Aristotle shifted the emphasis of philosophy from the nature of knowledge to the less controversial, but more practical problem of representing knowledge. His monumental life’s work resulted in an encyclopedic compilation of the knowledge of his day. But before he could compile that knowledge, Aristotle had to invent the words for representing it. He established the initial terminology and defined the scope of logic, physics, metaphysics, biology, psychology, linguistics, politics, ethics, rhetoric, and economics.

(John Sowa, Knowledge Representation)

When someone asks me for references on database design, systems modeling, or business semantics, I frequently recommend the Organon, Aristotle’s collected works on logic. This isn’t a trick; Aristotle really is the place to start. Even after nearly 2,500 years, Aristotle’s writings on categories, propositions, syllogisms, and reasoning have never really been equaled, nor has the scope of his intellect. Countless fundamental 21st-century words such as “category,” “metaphor,” and “hypothesis” all come directly from Aristotle; even the modern meaning of “meta” is an intellectual inheritance that is directly traceable to the philosopher.1 Aristotle’s work has been so influential on our thinking and our language that it is difficult to frame a thorough discussion about the meaning of things in the real world and how to represent them in language without talking about his ideas.

In the Organon, Aristotle begins his discussion about words and sentences (i.e., propositions) by discussing “categories.”2 For Aristotle, there were fundamental categories:

- Substance
- Quantity
- Quality
- Time
- Place
- Action
- Passion

These categories allowed one to talk not just about sentences but also about how to reason with sentences (propositions) through syllogisms. If you read modern books on semantics, you find more formal, mathematical discussions on meaning and reasoning, but they are all based on Aristotle’s foundations. So, following Aristotle, we begin our discussion of business semantics with categories as well.

The Business Intelligence Advisory Service Executive Report is published by Cutter Consortium, 37 Broadway, Suite 1, Arlington, MA 02474-5552, USA. Client Services: Tel: +1 781 641 9876 or, within North America, +1 800 492 1650; Fax: +1 781 648 1950 or, within North America, +1 800 888 1816; E-mail: service@cutter.com; Web site: www.cutter.com. Group Publisher: Chris Generali, E-mail: cgenerali@cutter.com. Managing Editor: Cindy Swain, E-mail: cswain@cutter.com. Production Editor: Linda M. Dias, E-mail: ldias@cutter.com. ISSN: 1540-7403. ©2005 by Cutter Consortium. All rights reserved. Unauthorized reproduction in any form, including photocopying, faxing, and image scanning, is against the law. Reprints make an excellent training tool. For information about reprints and/or back issues, call +1 781 648 8700 or e-mail service@cutter.com.

BUSINESS SEMANTICS CATEGORIES

Over the years, I have bounced back and forth between describing meaning and things in pictures (drawings) and describing them in words (text). I now recognize that each method of description has its strengths and weaknesses. Pictures tap into the visual part of our mind and allow us to represent certain kinds of complexity that are extremely difficult to represent simply in words. But even in drawings, the words included are almost always vital. Text, on the other hand, allows us to talk about concepts in much greater depth and, in many cases, through stories that are extremely difficult to put into pictures. Text allows the mind to draw its own pictures.

Like Aristotle’s Organon, business semantics is based on fundamental categories or predicates. The most important core categories are the following:

- Actors
- Messages
- Subjects or objects
- Locations
- Events

From these, we can derive another set of more abstract categories:

- Business exchanges
- Business roles
- Business processes
- Business process activities and decisions
- User interfaces/formulas/decision rules
- Business relationships
- Business rules

Each of these derived categories is a configuration of the actors, messages, and so forth of the core categories. In the following sections, we describe both the core categories and the derived categories.

We discovered the major core categories (actors, messages, subjects/objects, locations, and events) not theoretically but by analyzing why a set of rather simple drawings we now call context diagrams were so effective in helping people to understand business systems environments.3 Figure 1 shows a simple context diagram.

[Figure 1: A business (systems) context diagram. Actors shown include Customer, Vendor, Employee, Sales, Estimating, Production Scheduling, Production, Billing, Purchasing, Receiving, Inventory Control, Cost Accounting, A/R & Sales Accounting, A/P, Payroll, Management, and the G/L System. Messages shown include Job request, Proposal, Order, Job specs, Forecast, Delivery date estimate, Production schedule, Production report, Inventory status, Material costs, Time cards, Customer shipment, Invoice, Customer payment, Credits, Complaints, Purchase request, P.O., Vendor shipment, Receipt notice, Vendor invoice, Vendor payment, Product prices, Payroll checks, Management reports, P/L, G/L report, and the customer payment, customer invoice, material, purchasing, and payroll journal entries (j/e).]

Context diagrams show only two kinds (or categories) of things explicitly: actors and messages. When we first started using them, the idea was to represent people, organizations, or systems communicating with other people, organizations, or systems with a minimum of other information. Actors are shown in context diagrams by ovals, rectangles, or circles. For us, messages are always depicted with one-way arrows.4 For each message in this environment, there is a sender and a receiver. Naturally, the sender is at the beginning of the arrow; the receiver is at the end.5 Actors and messages have names. In the case of actors, these names represent “individuals,” “organizations,” or “systems”6; in the case of messages, the names represent the particular “communication” that transpires between the two actors.

Events and subjects/objects can be discovered by analyzing context diagrams. Figure 2 shows a legend for reading, or analyzing, context diagrams and picking out the actors, messages, subjects (objects), and events.

Core Categories

In this section, we examine in greater detail each of the five individual core categories introduced above.
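As a rough illustration of how a context diagram carries the core categories, the following Python sketch treats a diagram as a set of named messages, each with a sender and a receiver. It is only a sketch under stated assumptions: the class names (`Actor`, `Message`, `ContextDiagram`), the `kind` field, and the two sample messages are hypothetical and are not taken from the report, and locations are left out for brevity. Events and subjects are not stored directly; they are derived from the messages, mirroring the point that they can be discovered by analyzing the diagram.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Actor:
    """An individual, organization, or system that sends or receives messages."""
    name: str
    kind: str = "organization"  # assumed values: "individual", "organization", "system"


@dataclass(frozen=True)
class Message:
    """A named, one-way communication from a sender to a receiver."""
    name: str
    sender: Actor
    receiver: Actor
    subjects: tuple[str, ...] = ()  # what the message is about (e.g., products)


@dataclass
class ContextDiagram:
    """Only actors and messages are explicit; events and subjects are derived."""
    messages: list[Message] = field(default_factory=list)

    def actors(self) -> set[Actor]:
        return {a for m in self.messages for a in (m.sender, m.receiver)}

    def events(self) -> list[str]:
        # Every message implies two events: one send and one receive.
        found = []
        for m in self.messages:
            found.append(f"{m.sender.name} sends {m.name}")
            found.append(f"{m.receiver.name} receives {m.name}")
        return found

    def subjects(self) -> set[str]:
        return {s for m in self.messages for s in m.subjects}


# Illustrative data only; not a transcription of Figure 1.
customer = Actor("Customer")
sales = Actor("Sales")
diagram = ContextDiagram([
    Message("Order", customer, sales, ("Product",)),
    Message("Proposal", sales, customer, ("Product",)),
])
```

Calling `diagram.events()` on this sample yields four events, two per message, which is the sense in which a complete set of messages already implies the events of the system.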

[Figure 2: A legend for reading context diagrams, identifying actors, messages, subjects, and events.]

Actors

In business semantics, actors (the entities that make things happen) are especially important. This shouldn’t be at all surprising since, in natural language, animate things (especially human beings) play an extraordinarily important role. In everyday life and language, we are absorbed with people. Phrases such as “The salesman said …” or “The guard looked …” or “He intended …” make up the bulk of our conversation. This personalization of meaning in speech carries over into our business and technological lives as well. When we refer to a large organization, we often use a phrase that begins with “they,” “we,” or even “it.” This propensity to anthropomorphize applies to all sorts of activities. For centuries, the law focused solely on “individuals” and “lords,” for example. Only over time did collective abstractions such as “citizens,” “the crown,” and finally “corporations” take shape as “legal persons” separate from individuals.

Today when we talk to users about business processes, people are still fundamental players. In our everyday parlance, we switch easily between “customer” and “Tom Smith” or between “vendor” and “Bill Jones.” If we are discussing an organization, we may refer to Microsoft or to Bill Gates, who personifies Microsoft to the outside world.

Individuals, organizations, and even systems “do things”: that is, they act, which is, after all, what makes them actors. But they can also be acted upon. In this respect, as we will see, they take on the role of subjects or objects of a conversation. Subjects or objects don’t appear explicitly in the context diagram, but they are there implicitly. Things that are acted upon are subjects or objects (the appropriate word would be “passives,” but that sounds awkward in modern English). Most things that are passive are inanimate (for example, products, services, parcels of land, and so on), but some are individuals or classes of individuals (possibly organizations or systems as well) that are viewed as subjects or objects (for example, prisoners, wards of the state, etc.). We discuss subjects/objects more completely below.

In the business world, there are major subclasses of actors that are particularly important: customers, vendors, employees, managers, stockholders, and so on. In most enterprises, even public ones, customers are by far the most important subclass of actors, but vendors, employees, and the like are a close second. In sports franchises and movies, for example, players or performers are the single greatest asset of the enterprise. As it turns out, defining what we really mean by customer, vendor, and the like is one of the most difficult tasks in business systems.

Messages

In our everyday speech, messages are also exceedingly important. We talk about sending a letter or an order to a person or an organization; we talk about receiving an e-mail, a shipping notice, or confirmation from a person or an organization. Messages are also referred to as transactions in a wide variety of systems contexts. As we will see, messages (especially structured ones) play a fundamental role in almost all systems: they are principal artifacts that crop up in every generation of business systems.

Like actors, messages have a long history in both natural language and business systems. Messages date back to the beginning of recorded history. As discussed here, messages are always sent from an actor to another actor and are always about something. Over time, verbal or symbolic messages (a handshake, for instance) became artifacts, were written down, and then finally became the basis for business interactions and ultimately business contracts.

The first documented business messages were actually clay bottles or containers called bullae, which contained tokens (also made of clay) that represented a real thing (cattle, sheep, etc.). These messages were discovered in Mesopotamia and date back to the 8th millennium BCE. These clay bottles and tokens predate the earliest written script by thousands of years7 and were used when transporting flocks from one location to another, representing a kind of bill of lading. A bottle said something like, “By this bottle, I signify that I am sending you 24 cattle.” When the receiver got the cattle, he broke open the bulla and counted the tokens that were encased inside to check whether he had received what the sender intended (I don’t know how they handled cattle that died or that had calves on the way).

But the clay bottles had a singular drawback: you couldn’t see inside them to count the number of cattle in the herd until you broke open the bottle. Over time, this problem was solved by pressing the tokens onto the outside of the bottle while it was still wet, making an impression of each of the tokens that was placed inside.8 Finally, people discovered that they could dispense with the tokens inside the bottle entirely and just send a tablet with the impressions.9 Many students of ancient writing believe that written language in the Middle East actually began with the introduction of these basic accounting messages and only later expanded to capture the full range of human experience; in a strange way, accounting may have been the mother of literature.

So the business message itself is a simple concept, but simple as it is, the business message has served as the fundamental basis of commerce since the beginning of civilization. These messages have been used to convey information and possessions (such as cattle, goats, or bottles of wine). They formed the basis for the earliest accounting systems. Indeed, almost all the earliest examples of written script dealt with primitive or not-so-primitive accounting systems. Messages have been used to report the most mundane matters, but many people believe that they are the basis of all civilized communities.

In business systems, the most fundamental messages are communications such as orders, shipments, bills, and payments.10 Of these four fundamental messages, perhaps the most important is the bill (in the sense of an invoice) since it is the formal expression of charges and of the change of ownership.11 But business systems and the law involve many kinds of bills: bills of exchange, bills of lading, bills of materials, bills of particulars, and so on. All these kinds of bills are some kind of list. Indeed, lists of things represent a large class of messages that are important in business systems and for which there is a very common template that shows up in all sorts of data models.

Business systems involve two primary kinds of messages: external and internal. External messages are usually the most important in business systems because they represent the contracts between the enterprise and its customers, vendors, or employees, or they represent required outputs that must be produced for the system to function. Internal messages represent communication between internal actors. Internal messages often represent key information transformations. For example, shipping notices are external orders transformed with added information about customer location, transportation information, and product information.

Messages are just information that represents the communications between two actors. Business messages may take any number of physical forms, including the following:

- Documents/forms
- Letters
- Packages
- Electronic signals
- Coded sequences
- Voice or other kinds of documented (or implied) communications
- DNA or messenger RNA

The Package (Document) Metaphor for Business Messages

Since everyone is familiar with packages and letters, they provide a particularly useful metaphor for understanding what we mean by business messages. In fact, the whole range of business semantics categories fits well with the idea of a package or letter: the senders and receivers are actors,12 the package or letter itself is the message, and the contents of the package or envelope (the letter) are the subject(s). Finally, the act of sending and receiving the package or letter is a message-related event (see Figure 3). Notice how the package resembles the clay bottles that the ancient Sumerians used 9,000 years ago. On the outside it has markings, on the inside it has things.

[Figure 3: The package metaphor. A package labeled From: Actor (Jim Jones, 123 Main, Toledo, OH); To: Actor (Bill Smith, Mason Hall, Lawrence, KS); Box/document: Message; Contents: Subject(s); Time stamp: Event (Boston, MA, 2004.05.10).]

Now, if I develop a document that represents the information about the package (its source, destination, contents, and pickup and delivery times), I have what we might call a paper shipping notice. It is not a stretch to see the shipping notice converted to electronic form, transmitted across the world, and used as a key electronic data interchange. In Figure 4, the business semantics shown in the package are now depicted in the form of an electronic message: that is, an invoice.

[Figure 4: The document metaphor. An invoice labeled Actor (from), Actor (to), Message (invoice), and Subjects (products).]

The nature of the business semantics model, then, provides a simple but powerful way to classify and model both data and applications from an overall enterprise or business-unit viewpoint. As we will see, most business processes can be modeled as sequences of messages and activities on those messages that create other messages.

From a systems semantic standpoint, actors and subjects (objects) are the stable things (i.e., customers, vendors, salespeople, and products) in the real world that persist over relatively long periods of time. Epistemologically, they exist independently of one another (i.e., the existence of one actor, subject, or object doesn’t depend on that of another actor, subject, or object).13 This makes actors and subjects (or objects) important in any enterprise architecture or major systems design; but from a business systems standpoint, messages are the glue that connects, or relates, the major business semantics categories to one another.

Figure 5, for example, graphically represents the information context of business communication: actor 1 (a customer) sends a message (an order) to actor 2 (order entry). (Note that an actor can be an individual, an organizational unit, or a system.) This message is about (that is, it refers to) a set of subjects/objects (products). At the beginning, as featured on the left side of the diagram, there is a sending event (send order), and at the end, on the right side, there is a receiving event (receive order).

[Figure 5: The message as the key business semantics connection. Labels: Customer, Order, Order entry, Product, “Send order,” “Receive order.”]
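The package metaphor and the customer-to-order-entry exchange can be sketched as a single record type. This is a minimal sketch, not the author's notation: the `BusinessMessage` class, its field names, the `is_thread` helper, and all sample values (including the timestamps and the "Credit manager" receiver) are assumptions introduced for illustration. The record holds a sender and receiver (actors), a name (the message), contents (subjects), and two timestamps (the send and receive events). The helper checks one simple reading of a "sequence of messages": each message is sent by the actor who received the previous one.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class BusinessMessage:
    name: str                   # the message itself, e.g. "Order"
    sender: str                 # "From:" actor
    receiver: str               # "To:" actor
    subjects: tuple[str, ...]   # contents: what the message is about
    sent_at: datetime           # time stamp of the sending event
    received_at: datetime       # time stamp of the receiving event

    def events(self) -> list[tuple[str, str, datetime]]:
        """Return the two message-related events as (event, actor, time)."""
        label = self.name.lower()
        return [
            (f"send {label}", self.sender, self.sent_at),
            (f"receive {label}", self.receiver, self.received_at),
        ]


def is_thread(messages: list[BusinessMessage]) -> bool:
    """True if each message is sent by the receiver of the one before it."""
    return all(prev.receiver == nxt.sender
               for prev, nxt in zip(messages, messages[1:]))


# Hypothetical sample data for illustration.
order = BusinessMessage(
    "Order", "Customer", "Order entry", ("Product",),
    datetime(2004, 5, 10, 9, 0), datetime(2004, 5, 10, 9, 5),
)
entered_order = BusinessMessage(
    "Entered order", "Order entry", "Credit manager", ("Product",),
    datetime(2004, 5, 10, 9, 30), datetime(2004, 5, 10, 9, 31),
)
```

Here `order.events()` produces a "send order" event for the customer and a "receive order" event for order entry, and `is_thread([order, entered_order])` holds because order entry both receives the first message and sends the second.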

In my context diagrams, messages are the links connecting the actors, but they are also much more. Messages are the DNA of business systems: they contain the common structure of information (data structure, attributes, and links) that is the key to database design. And as we said previously, sequences (or threads) of messages provide the structure of our most important business processes (see Figures 6a and 6b). In fact, if you have a comprehensive set of messages in a business context diagram, you also have an enormous amount of the key structural information required to understand your business processes.

[Figure 6a: The sequence (or “thread”) of business messages (the enterprise view). Actors: Customer, Order entry, Credit manager, Warehouse. Messages: Order, Entered order, Approved order, Shipment (shipped order). Subjects: Products ordered, Products entered, Products approved, Products shipped.]

[Figure 6b: The sequence (or “thread”) of business messages (the customer view). Actors: Customer, Order entry, Credit manager, Accounting. Messages: Order, Entered order, Approved order, Invoice, Payment. Subjects: Products ordered, Products entered, Products approved, Products billed, Products paid for.]

None of this is accidental; it is in the nature of the problem space and of business systems. In the same sense that human anatomy is the basis of medicine, business semantics is the anatomy of business systems. Each generation of systems designers learns for itself the anatomy of systems. Those designers who are capable of looking at systems as a whole find the same major pieces (actors and subjects) and the same connectors (messages). And if they look a bit more closely, they begin to see the events as well. Database designers perhaps see the structure of large systems better than any other group because they are fundamentally charged with identifying the pieces and the relationships within entire systems and business areas.

So just like actors and subjects (objects), messages are important in the domain of business semantics. From a data architecture and an application architecture standpoint, messages (especially major business transactions) provide the clearest and best guide for making architectural groupings. By the time we’ve completed our definitions, the role of business messages (i.e., transactions) will be clear.

Subjects (or Objects)14

messages. As we’ve seen, a set of common actors and messages defines a discussion or business process. From a purely informational standpoint, we could call this a communication context or conversation. Typically a business systems conversation is about something. We call that something a “subject” or “object.”15 The following is a list of subclasses of subject:

- Products
- Services
- Parcels of land
- Jobs

Subjects (or objects) are relatively easy to identify once one understands that they almost always exist as the basis of a business systems context. It is obviously easier to pick them out when dealing with a physical subject or object. It is more difficult when working with abstract kinds of objects. For example, if we were looking to understand and improve our firm’s employment/time reporting/payroll process, we might have trouble identifying the subject of the business conversation. With experience in business semantics, however, one comes to recognize that the subject is the job or position that the employee holds.

Obviously, subjects or objects that are individuals are essentially identical to actors except for their intent. While actors cause things to happen, subjects undergo actions, at least in the context of the given business system. More typical subjects are passive things, such as products, buildings, ideas, and personal property. While actors typically have names, subjects have numbers (IDs). When we explore the data that is normally associated with actors and subjects, we will see that by knowing the semantic category to which something belongs, we also automatically know a whole range of possible attributes from which we can select. This is an important characteristic, providing a rich vocabulary with which to design applications.

Locations

Things happen at specific locations (or places). In an overwhelming number of business systems, geography plays a major role in operations and management. Technology and communication have made distance less and less of a problem. Today, an increasing number of companies are becoming first national and then global players. The location where something happens is nearly always important. Location information is expressed as addresses and, increasingly, as GPS coordinates. Even as distance becomes less important, location becomes more important because it becomes easier to determine. In fact, with the advent of GPS and wireless communication, not only are fixed locations an important category, but transient or real-time locations are as well. GPS coordinates are fast becoming one of the common denominators for integrating all sorts of data visually. External information based on physical location is also one of the fastest-growing sets of data purchased by organizations from the outside.

Events, Time, and Change

Events are points in time that simply represent either the sending or the receiving of a message.16 Events are characterized as the following:

- Periodic (daily, weekly, monthly, etc.)
- Aperiodic (ordering, receiving, etc.)

For the most part, the events that you can deduce from a context diagram are aperiodic, or random, events. The event occurs when the sender chooses to send the message or when the message happens to arrive at the receiver’s location.17 In real-time systems, aperiodic events are used to synchronize processes.

The most common periodic events that are of interest in the business systems context are calendar events, such as end of the day, end of the week, end of the month, end of the quarter, and end of the year. Because so much business reporting is associated with the calendar (standard and/or fiscal), periodic events play a big role in most systems environments. Where aperiodic events have sources, periodic events are driven by clocks and calendars. Significant periodic events are so common that they are often embedded in code rather than in data.
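The distinction between the two kinds of events can be sketched in a few lines of Python. This is an illustrative sketch only: the `Event` and `EventKind` types, the function names, and the calendar conventions are assumptions, not part of the report. In particular, it assumes a standard (not fiscal) calendar, a week that ends on Sunday, and quarters that end in March, June, September, and December. Periodic events are computed from the date alone, which reflects how such rules tend to end up embedded in code, while aperiodic events need a source actor and a message.

```python
from dataclasses import dataclass
from datetime import date, datetime, timedelta
from enum import Enum


class EventKind(Enum):
    PERIODIC = "periodic"
    APERIODIC = "aperiodic"


@dataclass(frozen=True)
class Event:
    name: str
    kind: EventKind
    occurred_at: datetime
    source: str  # an actor for aperiodic events, "calendar" for periodic ones


def calendar_events(day: date) -> list[Event]:
    """Periodic events that fall at the close of `day` (standard calendar assumed)."""
    closing = datetime(day.year, day.month, day.day, 23, 59, 59)
    next_day = day + timedelta(days=1)
    names = ["end of day"]
    if day.weekday() == 6:  # assumption: the week ends on Sunday
        names.append("end of week")
    if next_day.month != day.month:
        names.append("end of month")
        if day.month in (3, 6, 9, 12):  # assumption: calendar quarters
            names.append("end of quarter")
        if day.month == 12:
            names.append("end of year")
    return [Event(n, EventKind.PERIODIC, closing, "calendar") for n in names]


def message_event(action: str, message: str, actor: str, at: datetime) -> Event:
    """An aperiodic event: an actor sends or receives a message at some moment."""
    return Event(f"{action} {message}", EventKind.APERIODIC, at, actor)
```

A fiscal calendar would need different month boundaries; holding those boundaries in a table rather than in the `if` statements is one way to move such rules from code into data.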

Ultimately, business systems do two major things: (1) they help actors (workers, professionals, and managers) within the enterprise produce goods and services, and (2) they allow actors (managers, analysts, and professionals) within the enterprise to track changes in their business or the real world over time. Monthly, quarterly, and annual reports help managers, stockholders, and others understand the quality of their products, how their products are selling, what their best customers are buying, how many hours their employees are working, and so on. These reports or graphs allow managers to control business operations: to hire new people, eliminate products, embark on new marketing programs, and so on. While aperiodic events allow us to have real-time control over events that need immediate correction, periodic events allow us to provide long-term control. Periodic events are simply markers of time. Students of statistics and mathematics will recall that if we want to track real change as opposed to noise, we need to understand the underlying frequency (rate of change) of the thing we’re studying and sample accordingly.18

Much of the long-term information that an enterprise needs does not reside within the enterprise or its internal systems. For example, if we want to understand the market share of our enterprise, we need information about our sales relative to the sales of our competitors.

The other major dichotomy within events, as with messages, involves whether the events originate within or outside the enterprise: that is, whether they are external or internal. External events are by definition outside the control of the enterprise, whereas internal events are more or less within the control of the business itself.19 As a consequence, external events, or externally driven events, are typically much more significant than are internal ones.

As information systems become more complex and integrated, events (such as time and change) become increasingly important as an element of business semantics. Capturing, updating, retrieving, and synchronizing events are at the heart of all complex systems. Over time, sophisticated business managers and software engineers have come up with elegantly simple ways of dealing with events, time, and change, now that this information must be captured and managed along with the other, more traditional categories of data.

Derived Categories

As mentioned, actors, messages, subjects, locations, and events are the core business semantics categories. With these definitions in place, it is possible to build a set of derived but equally important semantics categories, which include business exchanges, business processes, business roles, business process activities and decisions, user interfaces and formulas, business relationships, and business rules.

Business Exchanges

A business exchange represents a series of business messages flowing between two actors that signifies a business transaction.

Previously we talked about the oldest form of business message: the clay bottle. Now we’re going to discuss something even older: the original business transaction (i.e., the business exchange, which is a barter between two parties).20 The earliest form of commerce involved two parties exchanging something of equal value: “I give you a chicken, and you give me some salt,” or “I give you some salt, and you give me a bolt of wool.” These were the earliest and most elementary of all business transactions. In many parts of the undeveloped world, commerce is still carried out this way. But for the most part, the world of commerce has become more complicated and, at the same time, simpler. In a barter economy, traders have to keep all sorts of ratios in mind: salt versus chickens, salt versus goats, salt versus cattle, and so forth. Introducing money makes it easier, which is why it has become the common denominator of commerce. While major business interactions are still about trading something for something else, one side of the equation usually involves money. I give you money, and you give me a car, for example.

But even in the most sophisticated, money-denominated economies, the original idea of bartering is still instantiated in contract law. The key idea in contract law is “consideration,” where each party receives something of value from the other. Even where one party is truly giving something to another party, the giver often receives a nominal amount in return so that the contract is valid (that is, both sides receive something of value). Indeed, the business exchange is such a fundamental idea that it crops up in today’s most sophisticated organizational thinking as a way to define business processes.

Business Processes

A business process [is a] collection of related, structured activities — a chain of events — that produce a specific service or product for a particular customer or customers. — US Government Accountability Office, “Glossary of IT Investment Terms”

A business process is a sequence of value-added activities performed by identified “roles” that maps a set of input messages into a set of output messages in a repeatable fashion. At the most fundamental level, business processes are (or ought to be) detailed maps of business exchanges. We use a sales order business process as a principal example because it represents a complete business exchange. The essential condition for a true business exchange is that the main participants (or actors) must each get something out of the exchange. For a business exchange to be complete in the case of a sales order, for example, the enterprise and the customer must both get something out of the transaction, which they do: the customer gets products, and the enterprise gets money. Figure 7 shows a stream of actors, messages, and subjects that make up a true business process. The customer gets what he wants (the products he ordered), and the enterprise gets what it wants (the money). If over the long haul, either side fails to get what it wants, of course, the process fails.21 Figures 6a and 6b ultimately provide the basis for developing workflow diagrams that are increasingly used in business systems analysis to document business processes.
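The completeness condition for a business exchange can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Message` class, its `value` field, and the function `is_complete_exchange` are hypothetical names, and the enterprise is treated as a single party rather than as the separate internal actors shown in Figure 7.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Message:
    name: str        # e.g. "order", "shipment", "payment"
    sender: str      # party that sends the message
    receiver: str    # party that receives the message
    value: str = ""  # what of value the message delivers, if anything


def is_complete_exchange(messages: list[Message], party_a: str, party_b: str) -> bool:
    """True if each party receives something of value from the other."""
    def receives_value(receiver: str, sender: str) -> bool:
        return any(
            m.value and m.sender == sender and m.receiver == receiver
            for m in messages
        )
    return receives_value(party_a, party_b) and receives_value(party_b, party_a)


sales_order = [
    Message("order", "customer", "enterprise"),
    Message("shipment", "enterprise", "customer", value="products"),
    Message("invoice", "enterprise", "customer"),
    Message("payment", "customer", "enterprise", value="money"),
]

print(is_complete_exchange(sales_order, "customer", "enterprise"))      # True
print(is_complete_exchange(sales_order[:3], "customer", "enterprise"))  # False
```

The second call drops the payment message, so the enterprise receives nothing of value and the exchange is incomplete. A process scoped only to order entry or invoicing would fail the same check.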

Over the past 15 to 20 years, the importance of business processes has been increasingly recognized. Driven in part by the work of business researchers such as Michael Porter and Geary Rummler, who popularized the concept of business value chains and business processes, respectively, organizations came to learn that an enterprise could be best understood and measured in terms of its business processes. The more people understood the concept, the more it became clear that businesses created value only through their business processes.

In large organizations, business processes are too often defined in small chunks (the order entry process or the invoicing process), which may correspond closely with existing organizational functions but frequently don’t represent entire business exchanges. We encourage these organizations not to set the scope of their business processing activities too early in their analysis; if they hold off, they will likely be able to view their business process across a much broader spectrum (i.e., an entire business exchange).

But we recognize that some organizations are constrained and must choose smaller business process activities. As a result, in my classes and consulting with clients we frequently talk about “BP” (in uppercase letters) to indicate a business process based on a business exchange, and “bp” (in lowercase letters) to represent business processes built on existing technology and organizational structure.

Business process analysis involves documenting what the business process is today (the as-is model) and looking across the whole process to see what activities can be improved and what can be eliminated, as well as looking at it from the standpoint of both parties. In the sales order example, the parties are the enterprise (us) and the customer (them).

[Figure 7: A complete business exchange. Labels recovered from the diagram: actors (customer, order entry, credit manager, warehouse, accounts receivable); messages (order, entered order, approved order, shipment (shipped order), invoice, payment); subjects (products ordered, entered, approved, shipped, billed, and paid for).]


During the late 1980s and early 1990s, the term “business process reengineering” became an IT buzzword. Writers and consultants such as Michael Hammer, James Champy, and Tom Davenport became best-selling authors and speakers, if not household names. Business process reengineering began as the idea of creating whole new paradigms for doing business (starting with a blank sheet) but was quickly hijacked by management more interested in downsizing staff or installing expensive client-server applications or COTS packages. Over the past few years, business process analysis has more or less returned to its roots of analyzing and improving fundamental business processes. This has been aided by the need for organizations to move beyond traditional cost cutting and seriously rethink their basic business. It has also been aided by the emergence of networks, the Internet, and workflow management tools in automating significant portions of an organization’s operations. Paperless offices are increasingly a reality, and once all (or most) of an organization’s information exists primarily in digital (electronic) form, there is much greater flexibility in manipulating that information.

Business Roles

A role [is] a function or part performed especially in a particular operation or process <played a major role in the negotiations>. — Merriam-Webster Online Dictionary

From a business semantics standpoint, business processes are a fundamental category. Clearly understanding what business processes are is key to developing the right business solutions. By examining Figure 7, we can see how business exchanges are linked with the business context information with which we started. It is a very short step to convert these diagrams to business process (swimlane) diagrams (see Figure 8). As you can see, the business process diagram more or less retains the same messages that occurred in the business context diagram (see Figure 1) and changes the names of many of the actors into those of business roles.

[Figure 8: A business process (swimlane) diagram. Labels recovered from the diagram: lanes (customer, customer service rep, credit analyst, sales approval clerk, warehouse person, billing clerk); activities (enter order, check credit, allocate goods, ship goods, bill customer, process payment); messages (order, entered order, approved order, shipping notice, billing notice, shipment, invoice, payment).]

There is a subtle difference between roles and actors. Depending on the context, actors typically take on different roles. An individual, for example, may in the same day perform the role of mother, employee (at company 1), board member (at company 2), and volunteer (in a political campaign). Actors and roles are easy to confuse: actors are individuals (e.g., Sam and Mary), organizations (e.g., Royal Dutch/Shell Group, Microsoft, and Toyota), or systems (e.g., A/R, A/P, and ERP); roles are “customer,” “manager,” “employee,” and so on. Natural languages like English or Norwegian tend to lump actors and roles together. In Figure 8, the “customer” is both an actor and a role, so there’s no good reason to change the name. In large organizations where there is enough work for people to specialize in specific jobs, actor and role names are often the same; but in small organizations where individuals have to wear many hats, the same individual may serve many roles. From a semantics point of view, the difference between actor and role is vital to understand, since it can dramatically simplify business processes and avoid confusion.

In today’s advanced network environments, it is common to hear network administrators talk about “groups” and “roles,” since these are the items by which security is normally assigned. “Mary Jones,” for example, may be assigned the role of “credit analyst” and “sales approval clerk” at the same time. She may also be temporarily assigned the role of “credit manager” if her boss is on vacation or an extended leave of absence. Increasingly, organizations are moving to the concept of single sign-on and, as a result, want a single location where all of the roles that an individual has been officially assigned can be found to ensure that only people with the right security have access to sensitive data. The need for a careful definition of roles becomes even more important as top managers increasingly utilize workflow management tools to electronically enable their key business processes.

From a competitive standpoint, business processes are critical. One only has to look at the success of companies like Dell to see the impact of having a superior business process in terms of becoming the lowest-cost, most adaptable producer in a high-tech market.

Business Process Activities and Decisions

[An] activity [is] a named process, function, or task that occurs over time and has recognizable results. — US Government Accountability Office, “Glossary of IT Investment Terms”

Business process activities represent a unit of work done by a given role at a given point in a business process. Business process activities can be thought of as functions that take inputs and produce outputs. An activity can be either a primitive activity or a subprocess. Subprocesses are defined as components of activities that are too complex to be described in a single statement or program. A business process may require another, more detailed swimlane diagram to describe its subprocesses in detail. A primitive activity can be described in terms of a user interface (a report, screen, etc.), a set of business rules, and a set of data. In Figure 9, “enter order,” “check credit,” and “allocate goods” are all activities. Depending on the business context, all these activities could be primitive, while in other contexts, “enter order” might be a primitive activity, and “check credit” and “allocate goods” might be subprocesses. The principal characteristic of an activity within a business process is that it has specific inputs, specific business rules, and specific outputs and can be done by a specific role under specific circumstances. In this regard, activities look and act like well-behaved modules in a classic modular design. In recent years, there has been increasing interest in services and service-oriented architectures (SOAs). To a high degree, activities look and act like services in this new SOA world.22

Represented in business process diagrams as diamonds, decisions allow for explicit definitions of alternative business flows based on different conditions. At base, business process diagrams are really just traditional flowcharts with additions that allow for concurrent activities and synchronization.
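The view of activities as functions performed by a specific role, with decisions selecting between alternative flows, can be sketched as follows. This is a minimal illustration, not a workflow engine: the `Activity` class, the `ROLES_BY_ACTOR` table, the actor “Sam Smith,” and the dictionary-shaped order message are all assumptions introduced for the example. The role assignments for “Mary Jones” follow the Business Roles discussion, and the “preapproved” decision follows the labels in Figure 9.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical role assignments: one actor may hold several roles at once.
ROLES_BY_ACTOR = {
    "Mary Jones": {"credit analyst", "sales approval clerk"},
    "Sam Smith": {"customer service rep"},
}


@dataclass(frozen=True)
class Activity:
    name: str
    role: str                        # the role allowed to perform the activity
    perform: Callable[[dict], dict]  # maps an input message to an output message

    def run(self, actor: str, message: dict) -> dict:
        if self.role not in ROLES_BY_ACTOR.get(actor, set()):
            raise PermissionError(f"{actor} does not hold the role {self.role!r}")
        return self.perform(message)


enter_order = Activity(
    "enter order", "customer service rep",
    lambda order: {**order, "status": "entered"},
)
check_credit = Activity(
    "check credit", "credit analyst",
    lambda order: {**order, "status": "approved"},
)


def process_order(order: dict) -> dict:
    entered = enter_order.run("Sam Smith", order)
    # Decision (a diamond in the diagram): preapproved orders skip the credit check.
    if entered.get("preapproved"):
        return {**entered, "status": "preapproved"}
    return check_credit.run("Mary Jones", entered)


print(process_order({"order_no": 1, "preapproved": True})["status"])   # preapproved
print(process_order({"order_no": 2, "preapproved": False})["status"])  # approved
```

Because each activity is bound to a role rather than to a named person, reassigning work (for example, granting a temporary role during a vacation) only changes the role table, not the process definition.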

[Figure 9: Activities and decisions within a swimlane diagram. Labels recovered from the diagram: lanes (customer, customer service rep, credit analyst, sales approval clerk); activities (enter order, check credit, allocate goods); decision (“Preapproved?” with a “Yes” branch); messages (order, preapproved order, entered order, approved order).]

User Interfaces and Formulas

User interfaces are the ports through which users view an information system. User interfaces are the products that information systems provide and the only external point of contact between users and the system. Business semantics suggests that user interfaces be closely aligned with user activities. In user interface design, one of the best ways to define a screen or report is to ask, “What’s the best way to present this information so a user can use the information most effectively?” There are two primary things on an output or a screen: variables and labels of the variables. If an output is difficult to understand, it is probably a bad output.

Business analysts and developers often gloss over the importance of good user interface design in their rush to capture the “real” requirements. But user outputs and inputs are the real requirements. They are the only things that a user actually sees and therefore are of enormous importance.

Business systems capture, store, retrieve, and present information about the real world, and that real world consists of those actors, messages, subjects, locations, and events that constitute the system itself over time. Business semantics suggests that actors (customers) and subjects (products) are independent of one another. Coupled with experience, this independence suggests that, in most sales order systems, management is going to want to look at which customers bought which products in which regions and which products were purchased by which classes of customers in which regions.
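The kind of output described above (which customers or customer classes bought which products in which regions) amounts to totaling message line items along actor, subject, and location dimensions. The Python sketch below shows one way to do that; the sample data, the column names, and the function `quantity_by` are hypothetical and serve only to illustrate the idea.

```python
from collections import defaultdict

# Hypothetical sales order lines: (customer, customer_class, region, product, quantity)
ORDER_LINES = [
    ("Acme", "wholesale", "East", "hammer", 10),
    ("Acme", "wholesale", "East", "wrench", 4),
    ("Baker", "retail", "West", "hammer", 2),
    ("Cole", "retail", "East", "hammer", 1),
]


def quantity_by(lines, *dimensions):
    """Total quantity grouped by the named dimensions."""
    columns = ("customer", "customer_class", "region", "product")
    index = [columns.index(d) for d in dimensions]
    totals = defaultdict(int)
    for line in lines:
        key = tuple(line[i] for i in index)
        totals[key] += line[4]  # quantity is the fifth field
    return dict(totals)


print(quantity_by(ORDER_LINES, "region", "product"))
print(quantity_by(ORDER_LINES, "customer_class", "region", "product"))
```

Because customers and products are independent, any combination of these dimensions is a legitimate question, which is why the same line items can feed many different reports.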

Business Relationships

A good deal has been written about customer relationship management (CRM). The idea of CRM is to capture and understand more about customers over time. But much of the CRM literature is highly one-sided. It focuses too much on the enterprise’s knowledge of and control over the customer and too little on the customer’s context and needs. In business semantics, we take a broader view of business relationships: A business relationship comprises all of the business exchanges (transactions) that a major actor (e.g., a customer, vendor, business partner, or employee) has with an enterprise. This definition of CRM makes much more sense than traditional ones,23 because it highlights how the customer perceives the enterprise as well as how the enterprise perceives the customer.

Oftentimes, organizations treat their interactions with their customers and vendors as though each individual message or action were separate and unconnected to the others. So, for example, an auto insurer might not hesitate to cancel a customer’s policy after a couple of years of accidents caused by the customer’s teenage daughters. But the customer might see it as the insurance company ignoring a 20-year relationship with very few claims, especially once the teenage daughters graduate and go out on their own. This cancellation might make perfect sense from the standpoint of the insurance company looking at one- (or two-) year profit-and-loss figures, but it tends to ignore the long-term relationship. While the enterprise may think nothing of canceling a customer’s policy based on one or two years of bad experiences, customers remember bad experiences for a very long time.

In my use of the term here, business relationships are at the same time enormously important and enormously difficult for most large organizations, which have separate systems to handle different business messages (or transactions) for different lines of business. In many cases, these different systems identify the same actor (or customer) differently. Because of this, it is often difficult to relate the information from one system to another. This is the problem that data warehousing was designed to solve. But even with the best tools, data quality is still a significant issue. Large organizations make huge investments in legacy systems and for this reason are reluctant to replace them or merge them into one system. However, when looked at from the perspective of a long-term business relationship, the cost of a major integration process may have a high payoff within a relatively short period of time.

Business Rules

Business rules are the last and most difficult element of business systems. From a semantics standpoint, a business rule may involve references to all the other semantics categories. Some business rules determine eligibility, calculate royalty formulas, determine frequent-flyer miles, and so on. Whereas there are elegantly simple mechanisms for the automatic design of normalized databases and workflow networks, no such technique has yet been devised for business rules. Still, business rules deserve their own semantics category because of their overall importance to business systems. Indeed, one of the most difficult things about replacing aging legacy systems is their embedded business rules. Complex business rules often come into being over long periods of time, and their documentation is problematic, at best.

THE IMPACT OF BUSINESS SEMANTICS ON DATA AND SYSTEMS ARCHITECTURE

So now that we have a set of business semantics categories, here comes the fun part: using the semantics categories to build “smarter” business, data, and application architectures. Business semantics can tell us a great deal about how business systems should be architected. Business semantics provides software architects and designers with a set of templates that helps them quickly determine the overall framework of their systems, even their enterprise databases and applications. During the late 1970s, those of us involved in what is now called the data structure design group gained this insight. During this period, I observed that systems and databases tended to fall into distinct patterns based on what we now call their actors, messages, and subjects. Actors such as customers or vendors, for example, tended to be modeled as tables with unique keys. This was also the case for most subjects, such as products. Messages, on the other hand, tended to have a more complex model. As we will see, messages have two major components and are modeled via a combination of unique and foreign keys. But even though their nature makes them more complex, messages tend to be modeled more or less the same way all the time.

Data Modeling and Data Architecture

A well-designed, normalized database for large business systems often contains 200 or more data tables. Of these, my experience has been that only about 10% (roughly 20 tables) are critical to the system design. The remainder are largely interface and reference tables. In my experience, if you get the top 10% right, everything works well. With insights gained from my business semantics work, getting the top 10% right is a manageable task.

Modeling Actors, Subjects, and Messages

The first thing to understand about actors (individuals, organizations, and systems) is that in the real world, they exist independently of one another. This is what philosophers call “ontological” independence. Customers, for example, are independent of one another. I may physically inherit genes from my parents, but after a certain age, I don’t physically depend on them for my existence. Things in the real world that are independent of one another ought to be modeled as independent entities or classes in corresponding business systems models. If Frank Jones and Frank Smith are separate customers, they need to be assigned different identification numbers and have their attributes collected and updated independently as well.

The same holds true for the subject (objects) category. Subjects are independent things; they are individual products (such as hammers, wrenches, or screwdrivers), so they should have unique identification numbers as well. Historically, only expensive products such as cars and computers and television sets have had truly unique IDs (i.e., serial numbers). But with the emergence of radio frequency identification (RFID) technology, the day is coming when nearly every produced product will have its own unique ID, and our business systems will need to be prepared to model billions of unique IDs and capture a whole range of new information about that product including, perhaps, current location and status. The bottom line is that from a business semantics standpoint, actors and subjects are modeled in simple tables with unique IDs (see Figure 10).

The fundamental data model of messages (transactions) is only a bit more complex than the data models for actors and subjects. In general, there are two parts to a message: a message header and message line item. Figure 11 shows a typical message data structure. The actor, subject, and message data models are the building blocks for most major business systems (we’ll address event/time and location information later). In a sense, actors and subjects represent the nouns of our business semantics sentences (propositions), and messages represent the verbs. With them, we can construct the basic model for a business communication.

[Figure 10: Actor and subject data models.]
Customer: Customer No*, Customer Name, Customer Address, Customer City, Customer State, Customer ZIP
Product: Product No*, Product Serial No*, Product Desc, Product UM, Product Price

[Figure 11: A typical message data model (an Acme Manufacturing invoice).]
Invoice Header: Invoice No*, Invoice Date, Order No, Order Date, Customer No, Customer Name
Invoice Line: Product No*, Product Desc, Quantity Order, Product Price, Extended Price

Modeling Business Communications

Modeling messages is the key to real-world database design. Messages provide links that provide the structure of the key business processes, which in turn provide the structure for our business systems.

These assertions take some explanation. Referring back to the context diagram in Figure 1, we can ask and answer the following questions:

Q: How were the actors connected?
A: Through messages.

Q: What did the subject (or object) hang off of?
A: The messages.

Q: How are the actors and subjects connected within a data model based on solid business semantics?
A: Through the messages.

It all makes sense. Business systems are fundamentally models of real-world communication systems, and in communication systems the connection links occur through the messages. In business systems, actors and subjects are related via messages. Figure 12 shows the structure of a traditional data model for a business communication (in this case, an invoice).

[Figure 12: The data model of a business communication.]
Actor: Customer (Customer No*, Customer Name, Customer Address, Customer City, Customer State, Customer ZIP)
Message: Invoice Header (Invoice No*, Invoice Date, Order No, Order Date, Customer No, Customer Name) and Invoice Line (Order No*, Product No*, Product Desc, Quantity Order, Product Price, Extended Price)
Subject: Product (Product No*, Product Serial No*, Product Desc, Product UM, Product Price)

After creating hundreds of data models with nearly identical structures, it occurred to me that there had to be a common underlying pattern. As I reviewed the relationship between these models and the business context and business process models created for dozens of projects, it became clear that a fundamental business semantics was being represented.

The basic pattern for business communication has become a business semantics template that recurs repeatedly in all business systems. Here the names of the actors, subjects, and messages can be thought of as variables in a mathematical sense. And like variables in mathematical equations, if we substitute one or more variables consistently in our template, we get another well-formed model. For example, if we substitute “vendor” for “customer,” “purchase order” for “order,” and “vendor product” for “product,” we have a common data model for a “purchasing” as opposed to a “sales order” business process.

WHAT IS BUSINESS SEMANTICS GOOD FOR?

Many people are afraid of theory. When you use words like “semantics” or “ontology” or “taxonomy,” their eyes start to glaze over. But they are just words; they only stump us because we don’t use them very often. This time, the unusual words and the business semantics theory are on our side. Like any good theory, business semantics has lots of practical applications. For example, you can use business semantics to develop a solid enterprise data architecture, to help you design your core data warehouse, or to come up with a high-level application architecture. In fact, you can use this basic set of ideas to do many things. In this section, we’re going to discuss the major uses that result directly from our understanding of the business semantics model.
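As a concrete starting point for these uses, the business communication data model of Figure 12 can be sketched with Python’s standard `sqlite3` module. The table and column names are simplified from the figures, and the composite key on the line-item table and the sample rows are assumptions made for illustration rather than details taken from the report.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customer (            -- actor
    customer_no   INTEGER PRIMARY KEY,
    customer_name TEXT NOT NULL
);
CREATE TABLE product (             -- subject
    product_no    INTEGER PRIMARY KEY,
    product_desc  TEXT NOT NULL,
    product_price REAL NOT NULL
);
CREATE TABLE invoice_header (      -- message header: one row per invoice
    invoice_no    INTEGER PRIMARY KEY,
    invoice_date  TEXT NOT NULL,
    customer_no   INTEGER NOT NULL REFERENCES customer (customer_no)
);
CREATE TABLE invoice_line (        -- message line item: one row per product
    invoice_no    INTEGER NOT NULL REFERENCES invoice_header (invoice_no),
    product_no    INTEGER NOT NULL REFERENCES product (product_no),
    quantity      INTEGER NOT NULL,
    PRIMARY KEY (invoice_no, product_no)  -- assumed composite key
);
INSERT INTO customer VALUES (1, 'Frank Jones'), (2, 'Frank Smith');
INSERT INTO product VALUES (10, 'hammer', 12.0), (11, 'wrench', 8.5);
INSERT INTO invoice_header VALUES (100, '2005-07-01', 1);
INSERT INTO invoice_line VALUES (100, 10, 2), (100, 11, 1);
""")

# Actors and subjects are connected only through the message tables.
rows = conn.execute("""
    SELECT c.customer_name, p.product_desc, l.quantity * p.product_price
    FROM customer c
    JOIN invoice_header h ON h.customer_no = c.customer_no
    JOIN invoice_line l   ON l.invoice_no = h.invoice_no
    JOIN product p        ON p.product_no = l.product_no
    ORDER BY p.product_no
""").fetchall()
print(rows)
```

Note that `customer` and `product` carry no references to each other: the only path between an actor and a subject runs through the message header and its line items, which is the pattern the questions and answers above describe.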

Enterprise Data Architecture

Templates are great for designing complex things. They allow us, for example, to model any <actor><message><subject><actor> relationship, and any business system involves many of these relationships since messages are the threads from which the foundation of our systems is created. In the same way that we used messages to help us structure our business processes (see Figures 6a and 6b), this communication-based template can be leveraged to build complex data architectures as well. Figure 13, for example, is a high-level enterprise data architecture for a printing company based on the repeated use of this basic template. Note that in Figure 13, we have used the business communication template seven different times, reflecting the seven major messages that are included within this basic market-to-collection business process. Here the seven major messages (job request, proposal, contract, work order, shipment, invoice, and payment) lead us to an understanding of the basic data architecture of a large part of the organization. By starting with a business context model and coming to understand the basic messages and basic business process flow, we not only develop a better data model but tie it back visually and logically to the context and business process models that

we developed earlier (Figures 1 and 8, respectively). It always helps to have a solid discipline like these business semantics templates when tackling big (i.e., new/risky) problems. Engineers, for example, already have a great deal of knowledge about buildings and roads before they construct a new building on a steep hillside or a new bridge over a particularly tricky geological terrain. Like great architects, great business analysts need all the breaks they can get.

Like Figure 12, Figure 13 is a template, but it is higher level. Good engineers and architects take advantage of templates or tools like CAD/CAM to do more and more complex things. If we start a new project with a template, then we have a distinct advantage. Figure 14 takes Figure 13 further and shows exactly where the major business components go.

Figure 13: Enterprise data architecture for a printing company. (Entities shown: Employee, Customer; Job request, Proposal, Contract, Work order, Shipment, Invoice, Payment; the item-level entities Job request 1, Proposal item, Contract item, Work item, Shipment item, Invoice item, Payment item; Material, Work station, Equipment, Product.)

All experienced modelers, especially good ones, use mental templates from previous projects. In most cases, the templates come before the insight into the underlying reasons. As I often tell people, we frequently know what works long before we know why. This is certainly true about business semantics.

The key foundation ideas behind this kind of thinking are based on the work of a number of different people going back more than 30 years. Several of my colleagues and I worked on data models before there was even an agreed-upon relational database theory. In the 1970s, my late friend Jean-Dominique Warnier published the book Logical Construction of Systems [24]. The book provided the earliest enterprise data architecture of which I am aware. And it too was based on lots of experience as well as theory. When I visited Warnier in the late 1970s, he took me to a client who already had a real-world enterprise data architecture in place and a real, live enterprise data architect on the job (whose official title was logical data engineer).

Although I didn’t completely understand Warnier’s enterprise data architecture model at the time, it was clear that there was something profound about his work in modeling both the enterprise’s data and the enterprise’s applications. Years later, I came to realize that Warnier was building an enterprise data architecture based on what we now call business semantics modeling. Not surprisingly, then, one of the first places that most people use business semantics is in modeling their enterprise data.
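The template idea behind Figures 13 and 14 can be made concrete by writing the template down as data. The following is a minimal sketch in Python, not the author's tooling. The entity names come from the figures; the `Entity` class, the `build_template` helper, and the uniform "<message> item" naming are assumptions made for illustration (the figure labels some item entities differently, for example "Work item").

```python
from dataclasses import dataclass
from typing import List

# Entity names copied from Figures 13 and 14; the grouping into
# actors, messages, and subjects follows Figure 14.
ACTORS = ["Employee", "Customer"]
MESSAGES = ["Job request", "Proposal", "Contract", "Work order",
            "Shipment", "Invoice", "Payment"]
SUBJECTS = ["Material", "Work station", "Equipment", "Product"]


@dataclass(frozen=True)
class Entity:
    """One box in the enterprise data architecture (hypothetical type)."""
    name: str
    category: str        # "actor", "message", or "subject"
    part: str = "whole"  # messages are split into "header" and "item"


def build_template() -> List[Entity]:
    """Expand the template into a flat list of entities.

    Assumption: every message gets one header entity and one item
    entity, named uniformly as "<message> item".
    """
    entities = [Entity(name, "actor") for name in ACTORS]
    for name in MESSAGES:
        entities.append(Entity(name, "message", "header"))
        entities.append(Entity(f"{name} item", "message", "item"))
    entities += [Entity(name, "subject") for name in SUBJECTS]
    return entities


template = build_template()
message_entities = [e for e in template if e.category == "message"]
```

A new project that starts from a list like this begins with the stable categories already in place; the modeling effort then goes into what is specific to the business rather than into rediscovering customers, invoices, and products.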

Designing a Core Data Warehouse

A great deal of the theory behind business semantics also emerged as a result of understanding how to make data warehousing work. I was introduced to data warehousing during the late 1980s when I consulted for IBM. A development manager took me aside and asked me to look at a new concept that was in development. So I read some articles, talked to some of the researchers working in this area, and looked at some early product demos.

Figure 14: Mapping business semantics categories to an enterprise data architecture. (The entities of Figure 13 grouped by category. Actors: Employee, Customer. Messages: Job request, Proposal, Contract, Work order, Shipment, Invoice, Payment, and their item-level entities. Subjects: Material, Work station, Equipment, Product.)

After I studied the material, I told the development manager that I thought data warehousing was indeed an exciting concept and that I suspected many of his clients had been working on the problem for quite a while. Moreover, I reckoned that if IBM had actually come up with something truly new and useful, its clients and others would be very interested.

This territory was so familiar because one of my first big projects was an early data warehousing application. That project had some bad moments, so I was acutely aware of how difficult it is to execute these systems. In this project, we had to integrate a lot of data, produce dozens of reports in different sequences, and support truly ad hoc reporting for a number of different user groups. On top of all this, we were attempting these things long before many of the necessary end-user tools were available.

During the past 15 years or so, I have worked on more and more data warehousing and business intelligence assignments. In the process, I developed a framework that I refer to as an enterprise data flow architecture (EDFA), which is illustrated in Figure 15. I came up with the architecture because there were several competing theories about how best to do data warehousing, particularly regarding the difference between data warehouses and data marts. As I explored the problem, I came to the conclusion that a robust data warehousing strategy needed to include both a core data warehouse (CDW) and several data marts. The EDFA addressed the data warehouse versus data mart issue by including both, but it emphasized the data warehouse component. Indeed, the key element of the data warehousing framework is the middle “core DW layer” in Figure 15.

Figure 15: Enterprise data flow architecture. (Layers shown: presentation/desktop access layer 1; operational data layer 2a; external data layer 2b; non-operational data layer; core DW layer 3; data mart layer 4; data staging and quality layer 5; data feed/data mining/indexing layer; data access layer 7; metadata repository layer 8; warehouse management layer 9; applications messaging (transport) layer 10; Internet/intranet layer 11. Other labels: internal ops providers, internal non-ops providers, external providers, internal users, external users; direct, virtual, and ad hoc queries; virtual, coarse, central, and distributed DW.)

The core data warehouse is the central clearinghouse for all information in the warehouse. All the elements on the right side of the CDW are involved with extracting, cleaning up, and integrating data for loading into the CDW, while all the elements on the left side are involved in indexing, searching, retrieving, and presenting data, either directly or via data marts or data cubes. From my standpoint, then, the design of the core data warehouse is critical.

In addition to the issue of data warehouses versus data marts, the other major design issue concerns whether the core data warehouse should contain atomic transaction data or only summary data. Historically, end-user reporting applications relied heavily on summary data, in large part because of storage and performance reasons. But my analysis indicated that since the cost per byte was plummeting, storage costs were less of a problem. I also concluded that for the sake of flexibility, to answer ad hoc queries, and because of the need to relate summary data back to detail transactions, a CDW ought to contain both atomic and summary data and focus on the atomic data (see Figure 16).

Figure 16: Layout of a core data warehouse. (Dimensional data: actors, with the summary hierarchy Company, Region, Territory, Customer; and objects, with the summary hierarchy Product Family, Product Class, Product. Message (transactional) data: summary messages (fact tables), such as Monthly Sales (Current Year); and base (atomic) data, such as Invoice Header and Invoice Line.)

As you can see, even the layout of our CDW leverages our business semantics categories. In data warehousing parlance, there has been a great deal of discussion about “facts” and “dimensions.” As it turns out, facts are nearly always related to messages (transactions), while dimensions tend to be related to either actors or subjects (objects). As a result, we have been able to quickly help people come up with sound data warehouse designs based on sound business semantics. The approach is (1) to work backward from the most likely outputs or query sets to the minimal data needed to produce those outputs and then (2) to relate that information to the actors, messages, subjects, and business process of the supporting business [25].

Even after nearly two decades, data warehousing is still confusing to many end users and even to a fair number of database designers and developers. This shouldn’t be. We find that the use of a consistent business semantics nomenclature throughout the process helps business users and IT developers do a better job of communicating and understanding the nature of their business and their data. Talking to business users about actors, messages, and subjects in the context of data warehousing takes a bit more time, but once the concepts take hold, it’s much easier for people to see how all their systems and data relate to the important things in their world.

Application Architecture

Form ever follows function. — Architect Louis Sullivan

Clearly, business semantics helps in rationalizing our business and data architectures, but it has significant uses in application architecture as well, especially at the highest levels. As the outline of business semantics became clearer in the late 1980s, we began to see that these same basic semantics categories were visible even in our highest-level systems thinking. If you look at any accounting (financial) software, for example, you will invariably find a set of primary components: accounts receivable (A/R), accounts payable (A/P), payroll, and general ledger (G/L); and if you are a manufacturer, there will also be an internal cost transfer component (C/T). Paralleling these accounting processes is a matching set of business (operational) components as well: sales, purchasing, human resources (HR), executive management, and manufacturing (see Table 1).

What is perhaps most surprising is how constant these functions are from a systems, accounting, and even an organizational standpoint. As far as we can determine, this framework of business accounting and operations domains was around long before computers and hasn’t changed materially over time. I believe that’s because the designation of these areas arose out of a trial-and-error discovery of which accounting/management structures worked best in the real world and that this structure became a fundamental part of accounting. Figure 17 portrays this observation graphically.

Table 1: The Major Accounting/Operations Application Areas

Accounting       Operations
A/R              Sales
A/P              Purchasing
Payroll          Human resources
Cost transfer    Manufacturing
G/L              Executive management

The reason that A/R (sales), A/P (purchasing), payroll (HR), and cost transfer (manufacturing), all the italicized elements in Figure 17, work as major application areas is that they each deal with a group of closely related things: a single major external actor, a major set of external messages (transactions), and a major external subject (object). In a way, each of these systems is both productive (i.e., it produces something of value) and canonical (i.e., it is symmetrical).

Figure 17: External actors, messages, subjects, and application architecture. (Labels recoverable from the figure. Application areas: Sales and A/R; Purchasing and A/P; Internal Service and Payroll; Internal Sales and Cost Transfer; Management Reporting and G/L. Actors: Customer; Vendor; Employee (Internal Vendor); Departments (Internal Vendor and Customer). Messages: customer order, customer shipment, customer invoice, customer payment; vendor P.O., vendor shipment, vendor invoice, vendor payment; internal P.O., internal shipment, internal invoice, cost transfer memo; the payroll messages are only partly legible (paycheck, work sheet, assignment). Subjects: Enterprise Product/Service, Vendor Product/Service, Internal Product/Service.)

Clearly, we are not the first to stumble on this insight. For example, J.D. Warnier built his entire systems/data/organizational architecture on the concept of just four canonical messages (orders, deliveries, bills, and payments). He believed that just as there was a logical way to build databases and applications, there was also a logical way to organize businesses.

If you look at organizations from a bottom-up perspective, the big picture is often hard to grasp, but if you view the big picture from a top-down perspective in the right way, powerful patterns begin to jump out. For decades, IT has pursued reuse; business semantics starts with the idea of reusing the most important categories: the things at the top that are most stable. By leveraging (that is, reusing) the patterns that come from business semantics, the whole world of services takes on new meaning.

THING 1/THING 2 THEORY: ENTITIES, OBJECTS, AND BUSINESS SEMANTICS

For at least 2,500 years, the search for the right way to think and talk about things has been a preoccupation. Philosophers, theologians, biologists, psychologists, linguists, and now computer scientists have attempted to make sense out of the real world (one might suppose) since the beginning of civilization. If our form of business semantics is to prove useful, it is important going forward to relate these business semantics categories with the most powerful trends in IT that have come before.

One of the issues we have to deal with is finding the correct level of abstraction. Too little abstraction and you get mired in too much detail; too much abstraction and there’s no communication. Some years ago, a client of mine told me about his “thing 1/thing 2 theory.” Everything (for instance, actors, messages, nuclear physics, or DNA) could be reduced to just one relationship: thing 1 does something to thing 2. By combining large numbers of these relationships, he said, you can explain anything (that is, everything). Unfortunately, no one but my friend could do much with this theory; it was simply too abstract.

Over the past 20 years, IT has been occupied with its own versions of the thing 1/thing 2 theory. First, it was all about entities. This began in the late 1970s, when Peter Chen came up with entity-relationship (ER) diagrams. As Chen maintained, most of the important things we want or need to say about the world can be mapped into statements about entities and their relationships with one another. In a way, this has turned out to be true. You will notice, for example, that we’ve been using ER diagrams to describe our data models here, and they work just fine. ER diagrams are good for medium-sized problem domains where you might have a few dozen entities and relationships. The problem with ER diagrams is that they get big and complicated in a hurry, and they end up containing hundreds of entities and hundreds of relationships. After a while, it gets difficult to see the forest for all the trees, especially on very large projects. Fundamentally, ER diagrams are network diagrams or complex network diagrams.

Over the past few years, the second version of the thing 1/thing 2 theory has been objects. Objects, via class diagrams, have become even more ubiquitous than entities. The 1990s were clearly the object decade. Object class structures and object class diagrams have become popular for modeling the real world and programming as well. Now object theory is based heavily on inheritance, mostly single inheritance. This means that an object that is subordinate to another object inherits all the attributes (and methods) from the other objects that are higher in the hierarchy. Object class structures allow one to model things top down, for example, from actors to individuals to customers. This turns out to be a good thing for structuring actor and subject categories but not so good for modeling messages and events. My friend and Cutter Consortium Senior Consultant Arun Majumdar is fond of saying that object databases are good at modeling vertical (hierarchical) relationships and not so good at modeling horizontal (network-based) ones.

My own feeling is that neither entities nor objects ultimately represent the right basis for business semantics; they are at least one level too high. As it turns out, you need to be able to model both hierarchical and network relationships, but more important, we need to use categories of data that actually mean something, that is, categories that have some inherent ontological meaning in the real world. Both entities and objects are so universal that you can’t say anything specific about them.

On the other hand, when you say that something is an actor, you already know quite a lot about that thing. For one, you know that it falls into one of three major subclasses: individuals, organizations, or systems. For another, you know that if it is an individual, it will have, for example, a first and last name, a sex, an address, height, weight, date of birth, and so on. The attributes (descriptors) work across the board. If you are defining data for an actor/individual, you already know a great deal about what he or she is like.

The same holds true for subjects. If the subject is an individual, you know the same things about him that you do about an actor who is an individual. If the subject is a product or a parcel of land, you also generally know a lot about that kind of thing as well. (You can even go online and find out what information other people keep for products or parcels, and it is likely that you will need the same information.)
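The point that a category already carries meaning can be sketched in code. The example below is illustrative only and is not taken from the report: the names `Category`, `Kind`, `Thing`, `expected_attributes`, and `warehouse_role` are hypothetical, the attribute list is the one given in the text for individuals, and the fact/dimension rule is the rule of thumb stated earlier for data warehouse design, which the text says holds "nearly always" rather than strictly.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional, Tuple


class Category(Enum):
    ACTOR = "actor"
    MESSAGE = "message"
    SUBJECT = "subject"


class Kind(Enum):
    # The three major subclasses of actor named in the text.
    INDIVIDUAL = "individual"
    ORGANIZATION = "organization"
    SYSTEM = "system"


# Attributes the text lists for any individual, whether the individual
# appears as an actor or as the subject of a message.
INDIVIDUAL_ATTRIBUTES: Tuple[str, ...] = (
    "first_name", "last_name", "sex", "address",
    "height", "weight", "date_of_birth",
)


@dataclass(frozen=True)
class Thing:
    name: str
    category: Category
    kind: Optional[Kind] = None  # left unset for messages


def expected_attributes(thing: Thing) -> Tuple[str, ...]:
    """Attributes known in advance, before any detailed analysis."""
    if thing.category is not Category.MESSAGE and thing.kind is Kind.INDIVIDUAL:
        return INDIVIDUAL_ATTRIBUTES
    return ()


def warehouse_role(thing: Thing) -> str:
    """Rule of thumb: messages become facts; actors and subjects, dimensions."""
    return "fact" if thing.category is Category.MESSAGE else "dimension"


# Hypothetical examples: a customer as actor, an employee as the
# subject of a message, and an invoice as a message.
customer = Thing("Customer", Category.ACTOR, Kind.INDIVIDUAL)
employee = Thing("Employee", Category.SUBJECT, Kind.INDIVIDUAL)
invoice = Thing("Invoice", Category.MESSAGE)
```

Nothing comparable can be written for a bare "entity" or "object": with no category attached, there is no basis for predicting attributes or roles.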

Obviously, this is true for business messages as well. You know from the start that orders or shipments or invoices have a basic structure. They have a header, which contains information about the order, shipment, or invoice as a whole, and detail information, which has information for each one of the things the order or shipment or invoice is about (i.e., messages are about something).

Natural languages like English or Polish or Bengali take advantage of semantic clues to string together information. In sentences, once you’ve mentioned someone (say, Aristotle or Madame Curie), you can then say “he” or “she,” and the listener or reader can figure out who you mean. With semantics-based modeling, we know who our actors are and what (or who) our subjects are. We also know what the messages are, who they relate to, and what subject they are about; we have introduced meaning into our systems discussion.

Semantics-based modeling allows you to build progressively smarter systems. Hopefully, one day our systems will be able to think beyond the two- or three-year-old level they have been stuck at for the past 30 or 40 years. All this is extremely important if we are going to create the next generation of truly smart applications and enterprises. Most of the systems that we design and install today are not noticeably more intelligent than those we designed and installed a decade or even two decades ago. Indeed, in many cases they are not even as smart, because a decade or two ago, analysts and designers were, for the most part, much closer to the business than we are today.

As I said previously, nearly 30 years ago, I met a client of J.D. Warnier who already had in place an enterprise data architecture, an enterprise data architect, and a set of integrated, data-based systems that ran on that architecture. Until relatively recently, through work on integrated enterprise resource planning (ERP) systems, data warehousing, and enterprise architecture, I hadn’t seen comparable insight at work.

CONCLUSION

Like enterprise architecture, business semantics is a critically important activity. Currently, the structure of our business, data, and application architectures plays, and will increasingly play, an important role in every aspect of IT and, therefore, in every aspect of our organizations. The more that these architectures reflect the real things that make up our business environment (the actors, messages, subjects, events, business communications, business exchanges, business processes, and business relationships), the better chance we have of surviving inevitable change.

Technology may have peaked in the stock market, but it has not peaked economically. Enterprises everywhere have more technology, which is good technology, than they can possibly use — and there is more on the way. It would be nice if the world slowed down a bit so we could catch our breath, but that isn’t going to happen. The rate of technological and business innovation is rising, not falling. While we were trying to survive the bursting Internet bubble, international outsourcing happened. While the telecommunications industry was trying to figure out how to use the vast fiber-optic infrastructure in which it had overinvested, Voice over IP happened. While we were trying to figure out how to use PDAs and tablet computers, really smart cell phones with GPS, built-in digital cameras, and hard disks happened. Business semantics simply says that all of our systems are (or ought to be) based on the major things (semantics categories) in the real world, and these things are not going away. If we are to stay in business, we must have customers, and those customers will be actors, and those actors will come in one of three flavors: individuals, organizations, or systems (the last of which are really the computers of individuals or organizations).

And if we are to stay in business, we will need more and more refined/smart (electronically supported) business processes. Time and distance mean nothing to computers and electronic communication, but they still mean something to people. New technologies will bring new ways of working. A good portion of this report was written on my back porch using a laptop with a large, bright screen. Every day I communicate with colleagues and clients around the world. The only way that I can do so is by using technology as best as I can. I know I’m not using the latest technology, but even using commercially available technology, I’m way ahead of a lot of people, even some of my clients.

But technology has to include meaning as well as speed. Our systems have to become smarter so that people can use them better. Unlike computers and telecommunications networks, people need buffers. People need rest and relaxation. People need information in the middle of nowhere, and they need to be able to find privacy and quiet in the middle of Tokyo or London. And if people are going to become totally dependent on technology to support their lives, they must have security.

Business semantics is not a buzzword; it is the foundation of business systems thinking. Like Aristotle 2,500 years ago, we are trying to build a new science: a science of business analysis, organization, and systems. In order to do so, we also have to invent the words for representing it.

First and foremost, semantics is about meaning, and meaning is largely about (or at least starts with) things in the real world. Business meaning is about individuals, organizations, and systems: how they interact and what they interact about. The systems that we build to support these individuals, organizations, and systems, along with the persistent memory that we create (i.e., data, information, and knowledge), are the fundamental stuff of our information systems and the framework for our enterprise architecture.

ENDNOTES

In classical Greek, the word or prefix “meta” simply means “after.” Aristotle’s writings on the foundations of philosophy originally had no title. Because the work appeared in Aristotle’s collected writings after his book of physics, it was referred to simply as “metaphysics.” Through history, Aristotle’s metaphysics came to be so influential that the prefix “meta” eventually took on the meaning “higher,” “above,” or “abstract,” as in “metamathematics.”

The original Greek meaning of category was simply “predicate.”

At various times, these context diagrams have also been called “entity diagrams,” “actor-message diagrams,” and “communication diagrams.” They are so simple and natural that they have been discovered independently countless times. I was first introduced to them in my work during the early 1970s; my friends Morris Nelson and Peter Kitch reinvented the diagrams in the late 1970s when we better understood how useful they really were. Later, I discovered individuals as different as Geary Rummler and Cutter Consortium Senior Consultant Verna Allee using the same diagrams.

Other forms of context diagrams use two-way arrows, but they fail to show the direction of communication.

A sender of a message can simultaneously be a receiver; in other words, an actor can send himself a message. In everyday life, people do this often by sending themselves messages or reminders. E-mail and other electronic tools make this task easy.

While we began with the idea that actors are people (individuals), we came to recognize that in the context of business semantics — especially concerning business systems — the term “actor” had to be expanded to include all things that can autonomously send and receive messages.

7. The earliest examples of Mesopotamian script date from approximately the end of the 4th millennium BCE, coinciding in time and geographic location with the rise of urban centers such as Uruk, Nippur, Susa, and Ur. These early records are used almost exclusively for accounting and record keeping. But these cuneiform records are really descendants of another counting system that had been in use for 5,000 years. As early as 8000 BCE, clay tokens were in use in Mesopotamia for some form of record keeping. This replacement of things by physical tokens and physical tokens by written symbols foretells the same transformation in business systems, where things were represented by numbers and letters and ultimately by electronic strings of ones and zeroes: it was the replacement of atoms with bits and of bits with meaning.

9. The word “cuneiform” comes from the word “wedge” in Latin. A cuneiform was any writing that could be made on a tablet with a wedge.

Jean-Dominique Warnier believed that you could actually model any organization using just these four messages in combination. He even wrote the book Logique de conception des organisations based on this concept.

As in most semantic areas, it is useful to look at the etymology of a term to understand it. In the case of the term “bill,” we find: “‘written statement,’ c.1340, from Anglo-L. billa ‘list,’ from M.L. bulla ‘decree, seal, document’ .... Sense of ‘account, invoice’ first recorded 1404; that of ‘order to pay’ (technically bill of exchange) is from 1579” (from Online Etymology Dictionary; www.etymonline.com/index.php?term=bill).

The notion of location comes into play with the “address” of actors. Addresses specify locations, which is how things are delivered by the postal or delivery system. The creation of e-mail has spawned “logical addresses” that correspond to virtual locations as well.

Persistence is a very important idea in business semantics. We tend to give persistent entities names or ID numbers so that we can refer to them uniquely over time. This carries over into our database design. The files that contain customer information or vendor information are organized based on customer IDs and vendor IDs and are as independent of one another as we can make them.

Over the years, my colleagues and I have gone back and forth about what to call these “things.” The reason that I use subjects (or objects) is in keeping with the fact that sometimes a message refers to a person, in which case it could be considered a subject, and sometimes it refers to something passive, in which case it could be considered an object. Moreover, when referring to a conversation, it is proper, at least in English, to say the “subject” of the discussion. It’s important to note that we’re using “object” in a much more restricted sense than those who talk about object-oriented analysis, design, or programming. In this business semantics context, we use object in much the same way as your English teacher might have used the term “direct object” when he or she taught you how to diagram sentences.

In the context of business processes or of workflows, events play an especially important role, but here we define events as something associated with messages.

In more elaborate discussions of context diagrams, it is useful to think of each message being transported through a channel. The postal system, for example, can be thought of as a channel for traditional “snail” mail and the Internet as a channel for e-mail. In some cases, the business analyst is interested in the channel (e.g., when transport time is critical), but for the most part we are not interested in the channel itself.

Mathematical analysis tells us that if the frequency of change is n, we should not attempt to sample any more often than n × 2 (which is twice the frequency of change).

The exception concerns periodic events that prompt the generation of externally defined messages. For example, in the US, the W-2 report, which is a statement of earnings, must be sent to the Internal Revenue Service at the beginning of the year. While this event might appear to be internal, it is actually driven by external requirements.

One dictionary defines bartering as a simple form of trade where goods or services are exchanged for a certain amount of other goods or services (i.e., there is no money involved in the transaction). Barter trade was common in societies where no monetary system existed or in economies suffering from a highly unstable currency (as when hyperinflation hits) or a lack of currency.

Many organizations, especially large ones, tend to think primarily about their side of business processes. They badger small vendors, for example, into giving them good prices, then they take an inordinately long time to pay. After a while, vendors simply won’t do business with these companies. Business processes are based on business exchanges, and business exchanges are two-way streets.

In the 1970s, my old friends and Cutter Business Technology Council Fellows Ed Yourdon and Tom DeMarco as well as Chris Gane and others promoted “structured analysis and design.” Two major concepts that came out of the movement described what makes a good module: structured gurus maintained that good modules had high cohesion and low coupling. In other words, these modules did only one or a few things and were lightly connected to other modules. With the advent of SOA, it appears to be a good time to dust off these concepts as we look for design criteria for services.

According to the American Teleservices Association, for example, CRM is defined as “The strategies, processes, people, and technologies used by companies to successfully attract and retain customers for maximum corporate growth and profit.”

24. See Warnier, Jean-Dominique. Logical Construction of Systems. Van Nostrand Reinhold, 1981.

25. For further details, see Ken Orr, “Integrating Enterprise Data Architecture and Enterprise Data Warehousing.” Cutter Consortium Business Intelligence Advisory Service Executive Report, Vol. 3, No. 2, 2003.

ABOUT THE AUTHOR

Ken Orr is a Fellow of the Cutter Business Technology Council and a Senior Consultant with Cutter Consortium’s Agile Software Development and Project Management, Business Intelligence, Business-IT Strategies, and Enterprise Architecture Practices. He is also a regular speaker at Cutter Summits and symposia. Mr. Orr is the founder of and Chief Researcher at the Ken Orr Institute, a business-technology research organization. Previously, he was an Affiliate Professor and Director of the Center for the Innovative Application of Technology with the School of Technology and Information Management at Washington University. He is an internationally recognized expert on technology transfer, software engineering, information architecture, and data warehousing. Mr. Orr has more than 30 years’ experience in analysis, design, project management, technology planning, and management consulting. He is the author of Structured Systems Development, Structured Requirements Definition, and The One Minute Methodology. He can be reached at korr@cutter.com.


About the Practice

Business Intelligence Practice

The strategies and technologies of business intelligence and knowledge management are critical issues enterprises must embrace if they are to remain competitive in the e-business economy. It’s more important than ever to make the right strategic decisions the first time. Cutter Consortium’s Business Intelligence Practice helps companies take all their enterprise data, augment it if appropriate, and turn it into a powerful strategic weapon that enables them to make better business decisions. The practice is unique in that it provides clients with the full picture: technology discussions, product reviews, insight into organizational and cultural issues, and strategic advice across the full spectrum of business intelligence. Clients get the background they need to manage technical issues like data cleansing as well as management issues such as how to encourage employees to participate in knowledge sharing and knowledge management initiatives. From tactics that will help transform your company to a culture that accepts and embraces the value of information, to surveys of the tools available to implement business intelligence initiatives, the Business Intelligence Practice helps clients leverage data into revenue-generating information. Through Cutter’s subscription-based service and consulting, mentoring, and training, clients are ensured opinionated analyses of the latest data warehousing, data mining, knowledge management, CRM, and business intelligence strategies and products. You’ll discover the benefits of implementing these solutions, as well as the pitfalls companies must consider when embracing these technologies.

Products and Services Available from the Business Intelligence Practice

• The Business Intelligence Advisory Service
• Consulting
• Inhouse Workshops
• Mentoring
• Research Reports

Other Cutter Consortium Practices

Cutter Consortium aligns its products and services into the nine practice areas below. Each of these practices includes a subscription-based periodical service, plus consulting and training services.

• Agile Software Development and Project Management
• Business Intelligence
• Business-IT Strategies
• Business Technology Trends and Impacts
• Enterprise Architecture
• IT Management
• Measurement and Benchmarking Strategies
• Enterprise Risk Management and Governance
• Sourcing and Vendor Relationships

Senior Consultant Team

The Senior Consultants on Cutter’s Business Intelligence team are thought leaders in the many disciplines that make up business intelligence. Like all Cutter Consortium Senior Consultants, each has gained a stellar reputation as a trailblazer in his or her field. They have written groundbreaking papers and books, developed methodologies that have been implemented by leading organizations, and continue to study the impact that business intelligence strategies and tactics are having on enterprises worldwide. The team includes:

• Verna Allee
• Stowe Boyd
• Ken Collier
• Clive Finkelstein
• Jonathan Geiger
• David Gleason
• Curt Hall
• André LeClerc
• Lisa Loftis
• David Loshin
• David Marco
• Larissa T. Moss
• Ken Orr
• Raymond Pettit
• Thomas C. Redman
• Michael Schmitz
• Karl M. Wiig