INFORMATION RETRIEVAL - MAKE VS. BUY

Paper presented to the Information Systems and Networks Symposium, March 27-29, 1974, at UCLA, Los Angeles, by Roger K. Summit

Introduction

Online information retrieval is one process in a loop which begins with information producers, who are working professionals, and ends with information users, who are also working professionals. Through the process of publication, abstracting, indexing and dissemination, computer tapes are produced, and these data are in turn used for the creation of computer files of abstracts and indexes. Computer programs, such as DIALOG developed by Lockheed, provide a language with which a user can access, browse and retrieve records from these files by specifying desired record content patterns.

Over the past 2-3 years a new industry has emerged - the information retrieval services industry. The industry has three tiers: (1) producers, who produce abstracts and/or tapes of the literature; (2) wholesalers, who store the taped information on computers and provide access and retrieval capabilities; and (3) retailers, who are information specialists and who work with information end users in identifying, retrieving and delivering desired information.

The industry has passed through several phases since its beginning in 1967. In 1967, online information retrieval was available only to an elite of government users, on a small number of data bases, and at very high cost. It is now available to diverse groups of government, university and industry personnel for use on a wide variety of data bases at a nominal cost. The future should see more data bases and still lower costs.

Lockheed initiated online retrieval service with the installation of a DIALOG terminal at the NASA/Ames Research Center in 1967. Since that time, several other
online document retrieval systems have been developed, most notably those by Mead-Data Corporation, Systems Development Corporation (SDC), Battelle, and the New York Times, together with variations on these basic systems by Informatics, the National Library of Medicine and others. There are presently several major online retrieval services vendors offering access to a wide variety and large volume of data bases. Lockheed is one of the leading suppliers of these services, and the examples to be developed are based on experience in the field.

The Economics of Use

The potential user today faces three options with regard to online retrieval systems:

o  Design and program a new system
o  Lease or purchase existing software for one's own computer
o  Contract for service from an existing vendor

Among federal government agencies, each of these three approaches has been pursued. Let us look at the relative economics.

NASA contracted with Lockheed in 1968 to design and develop what became known as NASA/RECON, a system that is still in use at NASA. The direct cost of this development can be estimated at $200-$300K. Enhancements and modifications have easily accounted for an additional $700K, putting the final cost at about $1 million. The New York Times retrieval system, developed by IBM under contract, is thought to have cost some $3 million. The National Library of Medicine contracted with SDC for an adaptation of ORBIT. It is likely that, with enhancements, this system too has cost well over $1 million. It would appear that direct system development is a costly way to provide online information retrieval services.

Purchasing information retrieval services from an existing vendor is the alternative selected by the National Agricultural Library in contracting with Lockheed for retrieval services on its NAL/CAIN data base of agricultural information. Let us examine the economics of this alternative.

The NAL/CAIN data base requires some 400 million characters of direct access storage. The monthly cost to store, maintain, and process requests on this file would be a minimum of the following:

$10K   Computer rental and operations costs (prorated)
  4K   Direct access storage rental
  2K   Software rental and/or maintenance
$16K   Minimum total monthly cost

The $16K is largely independent of the volume of searching done. By operating via the Lockheed system, NAL can be provided 50 hours of online search time per month for less than $1,500, or less than half the cost of the direct access storage alone. What allows such low individual usage costs are the economies of scale associated with offering this and several other data bases not only to NAL, but to many hundreds of other customers as well.

Another problem is the dilemma of forced growth, which often arises in organizations that plan to offer in-house service. It is normally uneconomical to provide online retrieval service on a single data base to a limited number of users. Thus where service is offered principally on a single data base, as is the case with several government installations such as NASA, NLM, AEC and DDC, there is pressure to extend the user population and/or to offer more data bases. This often results in still higher costs, with a need to further extend any existing subsidies or to assess direct user charges in an attempt to recover the costs. To get more customers there is pressure to extend service to other than government users - first to universities and contractors, then to the public at large. By this time the agency offering the retrieval service is faced with the problem of operating a full scale business, often in direct competition with services available from the commercial sector. Such is the dilemma. The commercial information retrieval service obtains this economic critical mass of data bases and users by providing national and even world-wide access to a full spectrum of data bases in the discipline areas it covers.
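
The NAL/CAIN figures quoted above can be set side by side in a short calculation. The following Python sketch is illustrative only and is not part of the original paper. It assumes that the in-house cost is entirely fixed (the text says only "largely independent" of volume), that vendor charges scale linearly with search hours, and that $1,500 is an upper bound on the vendor charge for 50 hours. The function and variable names are hypothetical.

```python
# Monthly in-house costs quoted in the text for the NAL/CAIN data base, in dollars.
IN_HOUSE_MONTHLY_COSTS = {
    "computer rental and operations (prorated)": 10_000,
    "direct access storage rental": 4_000,
    "software rental and/or maintenance": 2_000,
}

# Vendor figures quoted in the text: 50 search hours for less than $1,500 a month.
# The $1,500 is used here as an upper bound on the actual charge.
VENDOR_MONTHLY_CHARGE = 1_500
VENDOR_SEARCH_HOURS = 50


def in_house_monthly_total():
    """Sum the fixed monthly in-house cost components."""
    return sum(IN_HOUSE_MONTHLY_COSTS.values())


def cost_per_hour(monthly_cost, search_hours):
    """Average cost of one search hour at a given monthly volume."""
    return monthly_cost / search_hours


def break_even_hours(fixed_monthly_cost, vendor_rate_per_hour):
    """Monthly hours at which vendor charges would equal the in-house fixed cost.

    Assumption: vendor charges grow linearly with hours at a constant rate.
    """
    return fixed_monthly_cost / vendor_rate_per_hour


if __name__ == "__main__":
    total = in_house_monthly_total()
    vendor_rate = cost_per_hour(VENDOR_MONTHLY_CHARGE, VENDOR_SEARCH_HOURS)
    in_house_rate = cost_per_hour(total, VENDOR_SEARCH_HOURS)
    print(f"In-house minimum per month:  ${total:,}")
    print(f"Vendor rate (upper bound):   ${vendor_rate:.2f} per hour")
    print(f"In-house rate at 50 hours:   ${in_house_rate:.2f} per hour")
    print(f"Break-even volume:           {break_even_hours(total, vendor_rate):.0f} hours per month")
```

Under these assumptions the vendor rate is at most $30 per search hour, against $320 per hour for an in-house system used 50 hours a month, and a single user would need more than 500 search hours a month before the in-house fixed cost matched the vendor charges. This is the arithmetic behind the pressure to add users and data bases described above.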

Advantages of Buy vs. Make

In summary, advantages accrue to both the user and the data base supplier who take the buy decision, as follows:

To the User

o  A large variety of data bases is available for his varied information needs
o  Low unit search costs
o  No high start-up costs
o  Cost is completely variable with use
o  Service is available every day for the full day

To the Data Base Producer

o  Inexpensive access for the producer to his data base
o  Much broader exposure and use of his data base than he could achieve on his own
o  Availability of complementary data bases both for his own use and to attract other users to whom his data base will be a complementary rather than primary data base

Because of the advantages to both the information supplier and information user offered by the online retrieval services industry, there is economic justification for its existence and continuation. As a result, it is likely that services will continue to expand and that long term prices will continue to decrease.

ROGER K. SUMMIT, Program Manager, NAL Information System Service

Dr. Summit manages and directs research and development applications of interactive, computer-based information storage, retrieval, management, and analysis systems. An online bibliographic information search and retrieval service is currently being provided under his direction to the National Technical Information Service (NTIS) of the U.S. Department of Commerce.

Previously he was Project Manager on contracts such as RECON, a computer-based online system for use at NASA Centers, and similar systems for AEC, the Office of Education, and the European Space Research Organization (ESRO). Dr. Summit was also project leader for development of the DIALOG online information retrieval language, and conducted research in simulation and information retrieval systems. Other systems which he designed and implemented include a real-time system for collecting and analyzing computation center performance data, and an integrated, computer-based information storage and retrieval system for the LMSC Technical Information Centers.

Prior to joining LMSC in 1960, he was a Research Assistant at Stanford University, where he designed and programmed linear programming, inventory simulation, and queuing models.

Dr. Summit attended Stanford University; he received an A.B. in Psychology in 1952, an M.B.A. in Business Administration in 1957, and a Ph.D. in Management Science in 1965. Dr. Summit is the recipient of a special invention award for the Aerospace Business Environment Simulator Computer Program (licensed to IBM). Dr. Summit is the author of many papers in information science, online retrieval, and man-machine systems.