INFORMATION RETRIEVAL PAST, PRESENT AND FUTURE by ROGER K. SUMMIT
KEYNOTE ADDRESS DELIVERED AT THE ANNUAL NMLA CONFERENCE, ALBUQUERQUE, 30 APRIL 1981
INTRODUCTION
Before a detailed look at online information retrieval, it is useful to take a somewhat broader look at the computer and its implications for the library community. Many of your children will develop the same relationship to computers that most of us developed to books. The computer will become a primary means of access to information, knowledge, and entertainment. Moreover, development of skills in using computers is as important to the person of today as learning to read was to our grandparents.
It has only been in the past ten years, however, that computers have come to be used extensively for the manipulation of words (as opposed to the manipulation of numbers). Most people still think of computers as "number crunchers" rather than as knowledge machines, and they think of computer programmers as some special type of mathematician. For many applications, nothing could be further from the truth. Computers are generalized code manipulators, where the codes can stand for numbers, letters of the alphabet, or even instructions that tell the machine how to manipulate these codes, or how to communicate with a remote user terminal. Programmers are people versed in organizing and manipulating symbols. Educationally they come from a variety of disciplines.
The computer is the machine that made possible what we think of as online services today. Online services simply represent instances where someone was able to visualize the applicability of modern day data processing techniques to human needs. The word "online" implies that there is a human being within the data processing loop who guides and controls the computer in performing its assigned tasks, as opposed to "batch" processing wherein, once initiated, a given set of tasks processes to completion without human intervention.
The beauty and power of online applications result from the combining of the strengths of humans - decision making, recognition, reasoning - with those of the computers - rapid manipulation of data. One does not have to be a programmer in order to use the power of the computer. Each online application has associated with it a high-level command language which the user employs to direct the computer in its processing. Behind each command, however, are vast numbers of machine instructions written by programmers which carry out the high-level command. DIALOG is one such high-level command language.
Of the many online applications such as cataloging, statistical modeling, accounting, acquisitions, and word processing, this talk will concentrate on online information retrieval: its origin, its significance to the library community, and trends you can anticipate in the future. My examples, for the most part, will be taken from the DIALOG information retrieval service.
What is Online Retrieval?
From the librarian's point of view, online retrieval is simply another in a long list of techniques which have evolved to better control and organize recorded information. Several tools have evolved over the years in response to the changing needs of the reader and the development of new technologies.
SLIDE 1 (WORLD OF LIB. REF. TOOLS)
As we all know, the Dewey decimal system provides a correspondence coding between physical shelf location and subject by classifying books into a single subject category, and shelving them accordingly. The problem is that a book or document may pertain to several subjects. The familiar library card catalog provides an answer to this problem. As we know, in this case a document is assigned one or more subject headings. Cards are produced for each heading and the cards are sorted together to provide an alphabetical subject index in card form. But what do you do with journals or magazines where each article may pertain to a different subject? The obvious answer is to index each article, the result of which can be carried in a card catalog (which is likely to outgrow any space designed to contain it), or to publish the results in a bound volume such as the Reader's Guide to Periodical Literature; thus the book catalog and the evolution of abstracting and indexing services.
There are, however, several problems in the retrieval of information from manual systems. With the Dewey approach of faceted classification, to discover the proper category to access, the user's point of view must parallel that of the indexer or classifier for retrieval to occur. If the document can be classified several different ways, it may be missed entirely.
THE WORLD OF LIBRARY REFERENCE TOOLS
CURRENTLY AVAILABLE.
FUTURE CONSIDERATION
DEWEY DECIMAL
STATISTICAL SERVICES
CARD AND BOOK CATALOGS
FACT-FINDING SERVICES
ABST. AND INDEXING PUBLICATIONS
INTERLIBRARY LOAN
VIEW DATA SYSTEMS
ON LINE RETRIEVAL SERVICES
DOCUMENT FULFILLMENT SERVICES
VIDEO DISK ARCHIVING
Furthermore, for a manual system to be manageable at all, the number of subject headings in the overall classification scheme must be rather severely limited, or the problem of discovering the proper category to look under approaches the difficulty of finding the relevant document with no classification. With a limited number of subject headings, the number of documents in any single category becomes large and necessitates much serial scanning.
Online retrieval systems overcome many of the problems of manual search systems through value-added processing of the tapes received from abstracting and indexing organizations, wherein detailed indexes are created from the titles and abstracts of the reference materials (which will be explored in greater detail later). The cost of this greater power, however, is complexity. Manual systems can be used on a largely intuitive basis and require little instruction. Communicating with a computer in the performance of a relatively complex task such as information retrieval requires the user to learn a specialized command language.
A couple of search examples are worth a thousand words.
SLIDE 2 (COFFEE SEARCH)
Here we have instructed the computer to search SCISEARCH (Science Citation Index from the Institute for Scientific Information) for a topic involving the effects of coffee or similar stimulant beverages on the heart or stomach. We set up our search by telling the computer to SELECT (a command) lists of terms (e.g., coffee, tea, or caffeine) we might expect to find in the titles - in this case - of articles of interest. These terms are grouped by concept and the concept groups are combined to define the search. The result is a set of items, drawn from a collection of over 1 million items searched, which corresponds to our topic and which we can TYPE or PRINT OFFLINE. We choose to TYPE in the illustration.
DIALOG allows us to be more specific than simply specifying the occurrence of words - we can also specify the proximity of words or use word stems as shown in the next example.
SLIDE 3 (TAX AVOIDANCE)
Again the command is SELECT, but in this case we indicate (with codes) that we want either one of two conditions: the word "tax" is to occur within
? begin 34
      23oct80 16:40:20 User3468
      $2.64  0.022 Hrs File34  3 Descriptors
File34:SCISEARCH 78-80/WK38
(Copr. ISI Inc.)
SEE FILE 94(74-77)
      Set Items Description (+=OR; *=AND; -=NOT)
? select coffee or tea or caffeine
           229 COFFEE
           348 TEA
           536 CAFFEINE
      1   1085 COFFEE OR TEA OR CAFFEINE
? select health or disease? or adverse or toxic?
          5399 HEALTH
         24342 DISEASE?
           465 ADVERSE
          6199 TOXIC?
      2  36065 HEALTH OR DISEASE? OR ADVERSE OR TOXIC?
? select stomach or heart
          1106 STOMACH
          7933 HEART
      3   9037 STOMACH OR HEART
? combine 1 and 2 and 3
      4      6 1 AND 2 AND 3
? type 4/3/1-4
4/3/1
1387802  OATS ORDERS: JT862  0 REFS
IS DRINKING OF COFFEE PERMISSIBLE TO PATIENTS SUFFERING FROM CORONARY HEART-DISEASE (GERMAN)
GOTTWIK MG
UNIV HEIDELBERG,MED KLIN/D-6900 HEIDELBERG 1//FED REP GER/
MEDIZINISCHE WELT, V31, N22, P825-827, 1980
4/3/2
1147084  OATS ORDERS: JA139  10 REFS
TANNIC-ACID OF TEA AND COFFEE AS INCRIMINATING FACTOR FOR HEART-DISEASE (ENGLISH)
PANDA NC; SAHU BK; RAO AG; PANDA SK
UNIV COLL VET SCI & ANIM HUSB/ORISSA//INDIA/; AGR TECHNOL/BHUBANESWAR 751003//INDIA/
INDIAN JOURNAL OF NUTRITION AND DIETETICS, V16, N9, P348-355, 1979
4/3/3
0870386  OATS ORDERS: HA106  4 REFS
STUDIES ON COMPATIBILITY OF COFFEE CONSUMPTION IN PATIENTS WITH CORONARY HEART-DISEASE (GERMAN)
STOCKSMEIER U; BONK S; NICKEL C; SCHROETER M
IPR INST SOZIALMED PRAEVENT & REHABIL EV,HOHENBERGSTR 12/D-8132 TUTZING//FED REP GER/
MEDIZINISCHE WELT, V30, N25, P976-979, 1979
4/3/4
0448939  OATS ORDERS: FS507  28 REFS
COFFEE CONSUMPTION AND MORTALITY TOTAL MORTALITY, STROKE MORTALITY, AND CORONARY HEART-DISEASE MORTALITY (ENGLISH)
HEYDEN S; TYROLER HA; HEISS G; HAMES CG; BARTEL A
DUKE UNIV,MED CTR,DEPT COMMUNITY & FAMILY MED/DURHAM//NC/27710; UNIV N CAROLINA,SCH PUBL HLTH,DEPT EPIDEMIOL/CHAPEL HILL//NC/27514; EVANS CTY HEART STUDY,DEPT HLTH/CLAXTON//GA/
ARCHIVES OF INTERNAL MEDICINE, V138, N10, P1472-1475, 1978
? .cost
      23oct80 16:43:58 User3468
      0.062 Hrs File34  9 Descriptors
? begin 47
      27feb81 15:26:35 User3468
File47:Magazine Index - 77-81/Feb (Copr. IAC)
      Set Items Description (+=OR;*=AND;-=NOT)
? select tax(w)avoidance/ti or tax(f)defer?
             6 TAX(W)AVOIDANCE/TI
            10 TAX(F)DEFER?
      1     16 TAX(W)AVOIDANCE/TI OR TAX(F)DEFER?
? type 1/5/1-10
1/5/1
1569595
Hope for the high flyers. (tax-deferred stock option) (column)
Conway, John A.
Forbes v126 p10(1) Dec 8 1980
CODEN: FORBA
ARTICLE TYPE: biography
NAMED PEOPLE: Dole, Robert-economic policy
DESCRIPTORS: stocks-taxation; taxation-securities; income tax-securities
1/5/2
1547134
Rare coin investments tax sheltered - tax deferred. (advertisement)
Lee, Edward C.
Barrons v60 p35(2) Sept 29 1980
CODEN: BRNSA
illustration
DESCRIPTORS: tax planning-analysis; coins as an investment-taxation
investments-taxation; capital investments-taxation
1/5/3
1514648
Today's tax-shelter maze: find the right route for you: ways to defer tax
payments, avoid legal traps, find financial breaks.
Scollard, Jeannette Reddish
Vogue v170 p47(1) July 1980
CODEN: VOGUB
DESCRIPTORS: finance, personal-taxation; tax planning-technique
investments-taxation
1/5/4
1474838
Second thoughts. (Universal Life Church tax avoidance) (column)
McWilliams, Carey
Nation v230 p422(1) April 12 1980
CODEN: NATNB
DESCRIPTORS: Universal Life Church-taxation; church and state-taxation
exemption from-moral and religious aspects; Association of America
Patriots-political activity
1/5/5
1427025
Tax-deferred exchanging.
Ferguson, Ron
Real Estate Today v12 p19(4) Dec 1979
CODEN: RESTDR
illustration
DESCRIPTORS: real property, exchange of-taxation; tax planning-techniq
one word "(W)" of "avoidance" and this word pair is to occur in the title (TI); or "tax" is to occur in the same field as words having a stem of "defer" (such as deferred, deferral).
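The two conditions in this SELECT statement can be sketched in a few lines of Python. This is only an illustration of the idea, not DIALOG's implementation: the record layout (a dictionary of field name to text), the helper names, and the reading of (W) as "immediately followed by" are all assumptions made for the example.

```python
import re

def words(text):
    """Split a field into lowercase words; punctuation and hyphens separate words."""
    return re.findall(r"[a-z0-9]+", text.lower())

def adjacent(field, first, second):
    """(W)-style test (assumed meaning): `second` is the word right after `first`."""
    w = words(field)
    return any(a == first and b == second for a, b in zip(w, w[1:]))

def same_field(field, word, stem):
    """(F)-style test: `word` and some word beginning with `stem` share one field."""
    w = words(field)
    return word in w and any(x.startswith(stem) for x in w)

def matches(record):
    """TAX(W)AVOIDANCE/TI OR TAX(F)DEFER? against a dict of field name -> text."""
    if adjacent(record.get("title", ""), "tax", "avoidance"):
        return True
    return any(same_field(text, "tax", "defer") for text in record.values())

# Hypothetical records, loosely modeled on the slide.
records = [
    {"title": "Second thoughts. (Universal Life Church tax avoidance)",
     "descriptors": "church and state"},
    {"title": "Tax-deferred exchanging.", "descriptors": "real property"},
    {"title": "Deferred maintenance", "descriptors": "income tax"},
]
hits = [r["title"] for r in records if matches(r)]
```

The third record is rejected because "tax" and the "defer" stem fall in different fields, which is the distinction the (F) code draws.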
In this case we are searching Magazine Index, a database of popular magazines, so the references are easily obtained from the public library.
Early Information Retrieval Systems
But before we get ahead of our story, let us examine some of the historical underpinnings of online retrieval. Less than thirty years ago the state-of-the-art was much different. One of the earliest successful attempts at mechanical information retrieval was the Searching Selector at Western Reserve University.
SLIDE 4 (SEARCH SELECTOR)
This machine was developed by Allen Kent and J. W. Perry in the mid-1950s. The young woman at the right is in the process of programming the machine to search ten simultaneous questions. The "library" is encoded into the rolls of punched paper tape. The codes for each document are matched against each query. If there is a sufficient match, the document number is printed out on the Flexowriter. The size of the "library" is limited only by the patience of the operator. This device is one of the earliest approaches to electro-mechanical, bibliographic information retrieval and is a functional precursor to the batch-search computer systems developed and used with first and second generation computers.
Several organizations developed and/or provided batch search services using first and second generation computers in the late 1950s and in the early 1960s including - among others - Western Reserve University, NASA, Defense Documentation Center, the National Library of Medicine, and MIT. Batch-search systems are characterized by tape or punched-card input, printer output, and single job processing.
In batch searching a set of queries is coded by combining keywords or descriptors in Boolean fashion on punched cards and matching each query against each record (a bibliographic citation) on a large, sequential tape file. If there is a match, the record is listed; if not, the search proceeds to the next record. The deficiencies of batch searching are widely known: if the query is too specific (i.e., too many 'ands') the result is likely to be zero; if slightly less specific, a large portion of a large file can be printed.
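A minimal sketch of this batch procedure follows, assuming each record on the "tape" is an accession number with a set of keywords and each query is a pair of required and excluded terms. The data and names are hypothetical; the point is only that every query is tested against every record in a single sequential pass, with no chance to adjust a query along the way.

```python
def batch_search(queries, tape):
    """Match every query against every record in one sequential pass."""
    hits = {name: [] for name in queries}
    for accession, keywords in tape:              # read the file front to back
        for name, (all_of, none_of) in queries.items():
            # AND: every required term present; NOT: no excluded term present
            if all_of <= keywords and not (none_of & keywords):
                hits[name].append(accession)
    return hits

# Hypothetical tape file of (accession number, keyword set) records.
tape = [
    (1, {"satellite", "orbit", "russia"}),
    (2, {"satellite", "telemetry"}),
    (3, {"rocket", "fuel"}),
    (4, {"satellite", "orbit"}),
]
queries = {
    "too_specific": ({"satellite", "orbit", "telemetry", "russia"}, set()),
    "broad": ({"satellite"}, set()),
}
results = batch_search(queries, tape)
```

Here the over-specified query returns nothing and the broad one returns most of the file, the two failure modes described above.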
The Western Reserve University Searching Selector is programmed for a search of encoded metallurgical literature. Unit at left is a Flexowriter, which "reads" the encoded "library" from punched paper tape. The unit at the right is programmed for simultaneous search of ten questions. Documents from the "library" bearing on the questions are then identified when the Flexowriter automatically types out the serial number corresponding to those documents.
Any subsequent revision of the search must be resubmitted and reprocessed, often requiring days or weeks to get the next set of results. Furthermore, the processing time for a batch of searches on one of the computers of the day could take several hours to complete. NASA, one of the leaders of the day in sponsoring the development of retrieval systems, required - for example - eight hours on an IBM 1410 to process a batch of searches against its file of 200,000 aerospace report citations.
DIALOG Development
It was within this context of laborious and inefficient batch searching that the idea of DIALOG was born. Several critical events occurred in 1964:
• The King report "Automation and the Library of Congress" (published in 1963) was reviewed at Lockheed where it tickled the imagination of a Lockheed executive vice president (2)
• A prototype online retrieval system called CONVERSE had been successfully tested at Lockheed which utilized an RCA computer with random access disks and an online input device
• IBM announced its third generation computer series, the IBM/360, in April of that year
• A proposal "LMSC Information Storage and Retrieval Study Plan" was submitted to management in April 1964
The King report provided the precipitant stimulus; the CONVERSE experiment provided demonstrable credibility; the IBM third generation hardware with mass random access storage and interactive processing capability provided the means; and the proposal to Lockheed management resulted in the reorganization and funding necessary to establish the Lockheed Information Retrieval Laboratory in January of 1965. The objective of the new organization was stated in the proposal as follows:
"Establish a program to investigate the information retrieval problem. Initial emphasis will be placed on a particular problem area, namely, the library problem which includes storage, retrieval, dissemination and display elements which are applicable to other information retrieval areas."
An IBM 360/30 computer with 32K of internal memory, a 400 megabyte data cell mass storage device, and two 5.5 megabyte disks were installed in November of 1965. This configuration, except for the mass storage devices, was little more powerful than the microprocessors of today, and was to be the development environment for DIALOG.
SLIDE 5 (ORIGINAL COMPUTER)
The design of DIALOG originated from a belief that proper use of third generation computer technology could overcome most, if not all, of the problems of manual and mechanical retrieval approaches described previously. One of the problems with batch searching was the necessity to formulate extremely complex expressions for all but the simplest searches. Considering overall system requirements, why not design a language that would allow the searcher to break down a complex formulation into a series of simpler steps which could be executed one at a time, and then provide a facility to combine the results of the individual steps into the more complex formulation? If the technique works in problem solving, perhaps it would work in searching. Furthermore, such a design philosophy would allow the searcher to know the quantitative result of each step, which could help to guide formulation of the next step.
To assist in vocabulary selection, it was decided to provide some form of alphabetical index display showing posting counts and related term counts. Furthermore, it seemed desirable to allow the searcher to display "hits" at any time to provide for intermediate validation and/or search redirection. To facilitate this "cut and try" philosophy, it was necessary to save the results of each search statement, and to allow this result to be treated as a single element in any subsequent statement. This feature of recursion, incidentally, is probably the single most powerful aspect of online searching, and yet it was the most controversial in our early design discussions.
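The idea of saving every result as a numbered set that later statements can reuse can be sketched as follows. This is a hypothetical toy, not the DIALOG code: the `Session` class, its method names, and the small term index are assumptions made purely for illustration.

```python
class Session:
    """Toy search session in which every statement's result is saved as a numbered set."""

    def __init__(self, index):
        self.index = index   # term -> set of accession numbers (hypothetical data)
        self.sets = []       # saved results; set n is stored at self.sets[n - 1]

    def _save(self, result):
        self.sets.append(frozenset(result))
        return len(self.sets), len(result)   # (set number, item count)

    def select(self, *terms):
        """OR the postings of the given terms together and save the result."""
        result = set()
        for term in terms:
            result |= self.index.get(term, set())
        return self._save(result)

    def combine_and(self, *set_numbers):
        """AND previously saved sets; the result is itself saved and reusable."""
        result = set(self.sets[set_numbers[0] - 1])
        for n in set_numbers[1:]:
            result &= self.sets[n - 1]
        return self._save(result)

index = {"coffee": {1, 2, 3}, "tea": {2, 4}, "heart": {1, 4, 5}, "stomach": {6}}
s = Session(index)
step1 = s.select("coffee", "tea")      # set 1
step2 = s.select("heart", "stomach")   # set 2
step3 = s.combine_and(1, 2)            # set 3, built from sets 1 and 2
step4 = s.select("stomach")            # set 4
step5 = s.combine_and(3, 4)            # set 5 reuses set 3, a combined result
```

Each step reports its set number and item count, so the searcher sees the size of every intermediate result before deciding what to do next.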
Some of the group argued for the saving of only a single resultant set for sake of economy (the approach taken by Data Central, later acquired by Mead). In addition, the system must provide a quick response, and allow the results of any search to be printed in bulk offline. Finally, the procedures had to be simple to understand and efficient in execution.
If a system could be designed to fulfill these objectives, it was felt that we could leap-frog over the existing batch-oriented systems, and establish ourselves as a predominant force in the then infant field of information retrieval.
But how to translate these general requirements into procedural practice? The first consideration was the design of the language. It must be formulated in a manner that is easy to learn and yet powerful in result. Computer languages of the time were oriented to number manipulation, and were relatively complex and difficult to learn. But why not design "commands" to be used by the non-specialist that would themselves call up detailed computer programs to perform the specified operation? The idea of providing a command to define a functional processing step, together with an operand or data string which would tell the command what and how to process a given set of data, seemed to have the power and generality that were required. As it turned out, there needed to be just five basic commands as shown in the next slide.
SLIDE 6 (EARLY COMMANDS)
In parallel with design of the commands, the overall file structure had to be considered.
Sequential search techniques as used in batch systems simply could not perform responsively in an online system. The alternative, which was selected, was an inverted file structure. An inverted file is like a concordance or a back-of-the-book index wherein every word is associated with a list of record numbers (accession numbers) of the citations which contain that word. It is produced by processing the sequential or linear file to extract accession number/keyword pairs. These pairs are then sorted into word order, with the accession numbers for each word being placed in a list or string. Such an arrangement would allow a Boolean query for Russia and satellites, for example, to be processed by matching the string of accession numbers associated with "Russia" with the string associated with "satellites." If the same accession number appears in both strings, the citation it identifies must contain both words. An OR condition is accomplished by merging the strings; a NOT by merging the strings and removing common accession numbers. The result of any query statement is a set of accession numbers.
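The construction and use of an inverted file can be illustrated with a short Python sketch. It is a simplified model under stated assumptions: the linear file is a small hypothetical list of (accession number, text) records, words are split on whitespace only, and the function names are invented for the example.

```python
from itertools import groupby

def build_inverted_file(linear_file):
    """Extract (word, accession) pairs, sort them into word order, group by word."""
    pairs = sorted({(word, accession)
                    for accession, text in linear_file
                    for word in text.lower().split()})
    return {word: [accession for _, accession in group]
            for word, group in groupby(pairs, key=lambda pair: pair[0])}

def and_(a, b):
    """Match two sorted accession strings, keeping numbers present in both."""
    out, i, j = [], 0, 0
    while i < len(a) and j < len(b):
        if a[i] == b[j]:
            out.append(a[i])
            i += 1
            j += 1
        elif a[i] < b[j]:
            i += 1
        else:
            j += 1
    return out

def or_(a, b):
    """Merge two accession strings, dropping duplicates."""
    return sorted(set(a) | set(b))

def not_(a, b):
    """Keep the numbers of `a` that do not also appear in `b`."""
    drop = set(b)
    return [x for x in a if x not in drop]

# Hypothetical linear file of citations.
linear_file = [
    (101, "Russia launches satellites"),
    (102, "weather satellites"),
    (103, "Russia wheat harvest"),
]
index = build_inverted_file(linear_file)
both = and_(index["russia"], index["satellites"])
either = or_(index["russia"], index["satellites"])
russia_only = not_(index["russia"], index["satellites"])
```

Because each accession string is kept in sorted order, the AND step walks the two strings once rather than rescanning the linear file, and the length of any result is known before a single citation is fetched.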
EARLY DIALOG COMMANDS
BEGIN - TO INITIALIZE THE USER AREA
EXPAND (TERM) - TO DISPLAY THE ALPHABETICALLY NEAR TERMS WITH POSTING COUNTS TO AN INPUT TERM
SELECT (TERM) - TO INDICATE TERMS TO BE USED IN SEARCH. SUCH TERMS WERE ASSIGNED SET NUMBERS FOR EASY REFERENCE LATER
COMBINE (SETS) - TO PROVIDE FOR BOOLEAN COMBINATION OF SETS. THE RESULT OF A COMBINE WAS A SET THAT COULD BE USED IN SUBSEQUENT COMBINES
DISPLAY, TYPE, PRINT (SET/FMT/ITEMS) - TO OUTPUT INDIVIDUAL CITATIONS/ABSTRACTS TO CRT'S, TYPEWRITERS, OR OFFLINE PRINTOUTS, RESPECTIVELY
END - ENDED THE SEARCH AND REQUESTED THE USER TO ENTER AN EVALUATION OF THE SEARCH RESULTS
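The EXPAND command listed above, which shows the alphabetically near terms with their posting counts, can be sketched as a lookup in a sorted vocabulary. The function below is a hypothetical illustration: the `postings` dictionary, the window sizes, and the use of a binary search are assumptions for the example, not a description of how DIALOG stored its index.

```python
from bisect import bisect_left

def expand(postings, term, before=2, after=3):
    """Return the terms alphabetically near `term`, each with its posting count."""
    vocabulary = sorted(postings)
    pos = bisect_left(vocabulary, term)            # where `term` sits, or would sit
    window = vocabulary[max(0, pos - before): pos + after]
    return [(word, len(postings[word])) for word in window]

# Hypothetical inverted file: term -> list of accession numbers.
postings = {
    "coffee": [1, 2, 3],
    "cocoa": [4],
    "caffeine": [2, 5],
    "cola": [6],
    "tea": [2, 7],
    "coffin": [8],
}
near_coffee = expand(postings, "coffee")
```

Since the input term need not be in the vocabulary, a searcher can enter a fragment and still see the neighboring index terms and how heavily each is posted.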
Such an approach would not only allow for rapid searching, but would also provide the user with a count of the hits prior to any access to the sequential (linear) file. Furthermore, utilizing the random access capability of disk storage devices, we could build an index to the linear file which would allow any item to be immediately called up and displayed. Another thought - why not build an index to the inverted file as well, which would allow the indexing vocabulary to be displayed to assist the user in selecting terms? Finally, if the index would contain the word count, the user would have an immediate idea of the utility of the word as a search word.
How elegant! It all came together one morning in January of 1966; the preliminary design specification for DIALOG had been completed.
Key System Development Milestones
Between 1966 and 1970 there were five key events which provided the DIALOG project internal viability within Lockheed, and external visibility around the world:
• NASA prototype and development contracts
• COSATI panel demonstration
• ESRO and AEC development contracts
• ERIC services contract
• ASIS exhibit
With the encouragement of Mel Day, we submitted a proposal to NASA in early 1966 for a prototype online retrieval service which resulted in the award of a $20K contract to install and operate a remote terminal at Ames Research Center utilizing DIALOG to access the NASA file of 260K citations. DIALOG first became operational on the file in November 1966 and the remote terminal, an IBM 2260 display terminal, was installed at NASA in April 1967. The controller for the display, incidentally, was too large to be transported up the staircase to the second story library and had to be installed by knocking out a window casing and raising the controller by crane. The installation cost nearly equalled the cost of the project itself. The results of this project are reported in Reference 3.
Online retrieval proved to be popular among the scientific staff at Ames. One of the few problems which arose came from a librarian who complained that there was so much demand for searching, she had been forced to forego a committee meeting and several coffee breaks to keep up with the backlog. The NASA/Ames prototype contract was a key event in that it not only established the viability of the concept with Lockheed management, but it proved that people would voluntarily use a terminal to communicate with a computer to retrieve information.
We learned that both librarians and engineers could understand the use of Boolean operators (and, or, not) in searching a database. Moreover, this contract gave the project the exposure needed to attract future opportunities. The NASA/RECON development contract first suggested a realistic possibility of information retrieval as a formal line of business at Lockheed.
In 1968 COSATI (the Government Interagency Committee on Scientific and Technical Information) invited several online retrieval systems to demonstrate their capabilities on a file of project descriptors, and produced a film of the demonstrations under the auspices of Battelle Memorial Institute. The COSATI demonstration could be likened to an invitational state-of-the-art conference. The conference was attended by Lockheed, Mead, SDC, and Computer Corporation of America. Only Mead and Lockheed demonstrated online retrieval systems. The SDC and CCA systems were forerunners of database management systems.
While viewing, remember this was 1968 state-of-the-art. Note the extent to which the approach has survived and has reappeared in other subsequently developed systems. Also note the foresight of the narrator at the end of the film in predicting international networks of computers and users.
(COSATI FILM)
As a result of the COSATI demonstration in 1968, Harvey Marron of the U.S. Office of Education (USOE) became convinced that their ERIC file of educational research would be well-served by this new technology. A series of contracts initiated services on this database to some six USOE-sponsored terminal sites between 1970 and 1972. These contracts marked the shift in our emphasis from that of systems developer to that of service vendor.
The European Space Research Organization and Atomic Energy Commission software installation contracts were our final development contracts, and they reinforced our decision to shift to services in that they diverted significant amounts of human resources from database loading and system enhancements which were needed to support the services environment.
Online had in no way yet captured the imagination of the information community, however. The 1969 ASIS Meeting in San Francisco included a special online bazaar which was not nearly as interesting to attendees as Doug Engelbart's word-processing system (Augmented Human Intellect) which was also demonstrated.
It was soon after this meeting that the first database supplier contract was struck for a commercial database called PANDEX. I remember the negotiation - Dick Kolin of Crowell, Collier and Macmillan wanted considerable up-front money and a percentage of gross for any use of PANDEX online - probably a result of his role as a New York publisher. We finally settled for royalties of something on the order of $10 per hour and 5¢ per offline print. Little did either of us know we were setting an industry contracting and pricing standard.
Key Milestones in the 1970s
In 1970 DIALOG began to provide a true online retrieval service. In addition to the ERIC centers at Stanford University and in Wash., D.C. (ERIC contained 12,000 citations at the time), we initiated an inhouse service at Lockheed on the AEC, NASA, and PANDEX files.
1970 included several other interesting events:
• First Computer Communications, Inc. (CCI) terminal arrived (heavy, but portable - allowed dialup)
• ERIC was demonstrated at The White House Conference on Children (55K citations)
• First transoceanic demonstration of online retrieval held (Paris to Palo Alto searching the Nuclear Science Abstracts database)
By most measures, 1970 was the year - one decade ago - that marked the true beginning of third party online retrieval service (i.e., service to organizations who were neither the supplier of the database nor the operator of the computer system). Developments followed thick and fast during the early 1970s.
1971
• Free-text or proximity searching was added to DIALOG
• System Development Corporation (SDC) survey on the potential of online services was conducted
• Two additional ERIC sites were added - C.E.C. and RISE
• Council for Exceptional Children (C.E.C.) database was added to DIALOG
I remember my shock at seeing the SDC survey sent by Carlos Cuadra to some 8,000 organizations to inquire of their interest in online searching: which databases, how much would they pay, etc. The "secret" of the potential for online searching was out, it appeared. This survey served to convince my management at Lockheed that competition was nipping at our heels and that we must redouble our efforts to maintain our position in the field. Ironically it turned out, unbeknownst to us at the time, that only 80 of the 8,000 questionnaires were ever returned, and that SDC almost decided not to enter the online arena as a result.
1972
• Search-save feature was added to DIALOG
• Dialup service initiated
• CALSPAN and GE San Jose signed up for service
• Competitive National Technical Information Service (NTIS) award was received for service to 5 terminals
A project report written by Bob Donati, then and still Manager of our New York Office, summarized 1972 achievements:
                              1 Jan. 1972    31 Dec. 1972
Number of terminals                6              16
Subscribing organizations          5              13
Daily search hours                10              35
Records online                  100K            900K
1973
• Several databases were added: TRANDEX from C.C.M., Psychological Abstracts from the American Psychological Association, AGRICOLA (a name adopted much later) from the National Agricultural Library, and Science Abstracts from INSPEC
• First advertisement was prepared - 15 exposures during 1973
• First Users' Meeting held in our New York Office
• Washington Office opened with Rick Caputo
(SLIDE 7)
The most significant development in 1973, however, was the inter-connection of DIALOG and the TYMSHARE communications network. This connection provided a means for potential customers in over 40 U.S. cities to access DIALOG through a local telephone call at a flat charge of $10.00 per hour for telecommunications. The significance of such a service is that it ultimately has allowed the concentration of world-wide demand at a single processing center. Such a concentration provides the rationale for the offering of many small and specialized databases which otherwise would not be viable. This TYMSHARE service must be noted as one of the critical occurrences in the evolution of online searching.
1974
1974 saw a continuing buildup of databases with the addition of Predicasts, ISI, IFI/PLENUM, and Chemical Abstracts databases. The final event of importance in 1974 was the award of a grant from the National Science Foundation to study the utility of online searching in the public library. The significance of this grant was that it provided a potential avenue for use of online retrieval services by the general public. (Incidentally, we still have copies of the final report of this study available at no charge if you will contact me or give me your card.)
Present Service
From its humble beginning on the IBM 360/30 computer, the service has grown until it now is operated by two large-scale IBM computers with access to over 150 disk drives. For those of you who have not had a chance to visit us in Palo Alto, the next several slides provide a quick tour of our Palo Alto facility.
FACILITY SLIDES
• BUILDING
• CUSTOMER SERVICES
• TRAINING
• PUBLICATIONS
• ACCOUNTING
• PROGRAMMING
• COMPUTERS
• DISK STORAGE
DATABASE SLIDES
CUSTOMER PROFILE
COUNTRY PROFILE
GROWTH SUMMARY
DIALOG INFORMATION RETRIEVAL SERVICE

File No.  $     ¢     DATABASE (Supplier)

MULTIDISCIPLINARY
102       90    15    ASI (Congressional Information Service, Inc.)
101       90    15    CIS/INDEX (Congressional Information Service, Inc.)
35        55    12    COMPREHENSIVE DISSERTATION ABS. (Univ. Microfilms Inc.)
77        75    15    CONFERENCE PAPERS INDEX (Data Courier, Inc.)
200       15    n/a   DIALOG PUBLICATIONS (DIALOG Information Retrieval Service)
114       55    15    ENCYCLOPEDIA OF ASSOCIATIONS (Gale Research Company)
26        60    30    FOUNDATION DIRECTORY (The Foundation Center)
27        60    30    FOUNDATION GRANTS INDEX (The Foundation Center)
66        35    10    GPO MONTHLY CATALOG (U.S. Government Printing Office)
85        60    30    GRANTS DATABASE (Oryx Press)
47        45    10    MAGAZINE INDEX (Information Access Corp.)
78        60    30    NATIONAL FOUNDATIONS (The Foundation Center)
111       75    10    NATIONAL NEWSPAPER INDEX (Information Access Corporation)
211       95    10    NEWSEARCH (Information Access Corporation)
911       n/a   n/a   NEWSEARCH (Information Access Corporation)
49        60    15    PAIS INTERNATIONAL (Public Affairs Information Service, Inc.)
65        90    20    SSIE CURRENT RESEARCH (Smithsonian Science Info. Exchange)

BUSINESS/ECONOMICS
15        75    20    ABI/INFORM (Data Courier, Inc.)
19        60    20    CHEMICAL INDUSTRY NOTES (American Chemical Society)
100       90    $3    DISCLOSURE (Disclosure Incorporated)
90        65    20    ECONOMICS ABSTRACTS INTERNATIONAL (Learned Information Ltd.)
22        90    50    EIS INDUSTRIAL PLANTS (Economic Information Systems, Inc.)
92        90    50    EIS NONMANUFACTURING ESTABLISHMENTS (Economic Information Systems, Inc.)
105       45    25    FOREIGN TRADERS INDEX (U.S. Department of Commerce)
59        90    20    FROST & SULLIVAN DM 2 (Frost & Sullivan)
75        70    15    MANAGEMENT CONTENTS® (Management Contents, Inc.)
42        90    20    PHARMACEUTICAL NEWS INDEX (Data Courier, Inc.)
20        90    20    PTS FEDERAL INDEX (Predicasts, Inc.)
98        90    20    PTS F&S INDEXES 1972-1975 (Predicasts, Inc.)*
18        90    20    PTS F&S INDEXES 1976-present (Predicasts, Inc.)*
84        90    20    PTS INTERNATIONAL TIME SERIES (Predicasts, Inc.)*
83        90    20    PTS INTERNATIONAL FORECASTS (Predicasts, Inc.)*
17        90    20    PTS PREDALERT (Predicasts, Inc.)*
16        90    20    PTS PROMT (Predicasts, Inc.)*
81        90    20    PTS U.S. FORECASTS (Predicasts, Inc.)*
82        90    20    PTS U.S. TIME SERIES (Predicasts, Inc.)*
106       45    25    TRADE OPPORTUNITIES (U.S. Department of Commerce)
107       45    50    TRADE OPPORTUNITIES WEEKLY (U.S. Department of Commerce)
126       45    25    U.S. EXPORTS (U.S. Department of Commerce)

* Not Yet Available
File No.  $     ¢     DATABASE (Supplier)

APPLIED SCIENCE & TECHNOLOGY
45        35    10    APTIC (Air Pollution Tech. Info. Ctr. & the Franklin Institute)
44        35    15    AQUATIC SCIENCE & FISHERIES ABSTRACTS (NOAA)
112       35    15    AQUACULTURE (National Oceanic and Atmospheric Administration)
116       35    30    AQUALINE (Water Research Centre)
96        65    15    BHRA FLUID ENGINEERING (British Hydromechanics Research Association)
23        95    15    CLAIMS™/CHEM 1950-1970 (IFI/Plenum Data Company)
223       300   15    CLAIMS™/CHEM/UNITERM 1950-1970 (IFI/Plenum Data Company)
224       300   15    CLAIMS™/CHEM/UNITERM 1971-1977 (IFI/Plenum Data Company)
225       300   15    CLAIMS™/CHEM/UNITERM 1978-present (IFI/Plenum Data Company)
124       90    10    CLAIMS™/CLASS (IFI/Plenum Data Company)
24        95    15    CLAIMS™/U.S. PATENTS 1971-1977 (IFI/Plenum Data Company)
25        95    50    CLAIMS™/U.S. PATENT ABSTRACTS 1978-present (IFI/Plenum Data Company)
125       95    50    CLAIMS™/U.S. PATENT ABSTRACTS WEEKLY (IFI/Plenum Data Company)
8         65    15    COMPENDEX (Engineering Index, Inc.)
60        40    10    CRIS USDA (USDA)
69        90    20    ENERGYLINE® (Environment Information Center, Inc.)
40        90    20    ENVIROLINE® (Environment Information Center, Inc.)
68        60    15    ENVIRONMENTAL BIBLIOGRAPHY (Internatl. Acad. at Santa Barbara)
51        65    15    FOOD SCIENCE AND TECHNOLOGY ABS. (Intl. Food Info. Service)
79        55    10    FOODS ADLIBRA (K&M Publications, Inc.)
123       95    20    INPADOC (International Patent Documentation Center)
74        50    15    INTERNATIONAL PHARMACEUTICAL ABS. (Am. Soc. of Hospital Pharmacists)
14        75    15    ISMEC (Data Courier, Inc.)
32        80    12    METADEX (American Society for Metals)
118       45    20    NON-FERROUS METALS ABSTRACTS (British Non-Ferrous Metals Technology Center)
6         35    10    NTIS (National Technical Info. Service, U.S. Dept. of Commerce)
28        75    15    OCEANIC ABSTRACTS (Data Courier, Inc.)
48        55    15    PIRA (Research Assoc. for Paper & Board, Printing & Packaging Indus.)
41        75    15    POLLUTION ABSTRACTS (Data Courier, Inc.)
95        65    15    RAPRA ABSTRACTS (Rubber and Plastics Research Association of Great Britain)
115       65    15    SURFACE COATINGS ABSTRACTS (Paint Research Association of Great Britain)
63        40    10    TRIS (U.S. Department of Transportation and Transportation Research Board)
99        65    15    WELDASEARCH (The Welding Institute)
33        50    10    WORLD ALUMINUM ABSTRACTS (American Society for Metals)
67        55    10    WORLD TEXTILES (Shirley Institute)

* Not Yet Available
File No.  $     ¢     DATABASE (Supplier)

SCIENCE
110       25    5     AGRICOLA 1970-1978 (U.S.D.A. Technical Information Systems)
10        25    5     AGRICOLA 1979-present (U.S.D.A. Technical Information Systems)
55        45    10    BIOSIS PREVIEWS 1969-1973 (Biosciences Information Service)
5         45    10    BIOSIS PREVIEWS 1974-present (Biosciences Information Service)
2         70    20    CA SEARCH 1967-1971 (American Chemical Society)
3         70    20    CA SEARCH 1972-1976 (American Chemical Society)
4         70    20    CA SEARCH 1977-present (American Chemical Society)
31        70    20    CHEMNAME™
30        55    16    CHEMSEARCH™ (Chemical Abstracts Service, DIALOG Information Retrieval Service)
131       70    20    CHEMSIS™ (Chemical Abstracts Service, DIALOG Information Retrieval Service)
50        35    25    CAB ABSTRACTS (Commonwealth Agricultural Bureaux)
72        65    25    EXCERPTA MEDICA (Excerpta Medica)
73        65    25    EXCERPTA MEDICA IN PROCESS (Excerpta Medica)
58        70    20    GEOARCHIVE (Geosystems)
12        55    15    INSPEC 1969-1977 (Institution of Electrical Engineers)
13        55    15    INSPEC 1978-present (Institution of Electrical Engineers)
76        45    15    IRL LIFE SCIENCES COLLECTION (Information Retrieval Ltd.)
204       15    n/a   ONTAP™ CA SEARCH (American Chemical Society)
231       15    n/a   ONTAP™ CHEMNAME (American Chemical Society)
94        40    10    SCISEARCH® 1974-1977 (Institute for Scientific Information) subscriber
94        130   20    SCISEARCH® 1974-1977 (Institute for Scientific Information) nonsubscriber
34        30    10    SCISEARCH® 1978-present (Institute for Scientific Information) subscriber
34        120   20    SCISEARCH® 1978-present (Institute for Scientific Information) nonsubscriber
62        35    10    SPIN (American Institute of Physics)
52        45    15    TSCA INITIAL INVENTORY (Environmental Protection Agency, DIALOG Information Retrieval Service)
File No.  $     ¢     DATABASE (Supplier)

SOCIAL SCIENCES & HUMANITIES
9         25    10    AIM/ARM (Center for Vocational Education)
38        65    15    AMERICA: HISTORY & LIFE (ABC-Clio, Inc.)
56        60    15    ART MODERN (ABC-Clio, Inc.)
64        35    10    CHILD ABUSE AND NEGLECT (Natl. Cntr. for Child Abuse and Neglect)
1         25    10    ERIC (Educational Resources Information Center)
54        25    10    EXCEPTIONAL CHILD ED. RESOURCES (Council for Except. Children)
39        65    15    HISTORICAL ABSTRACTS (ABC-Clio, Inc.)
36        55    15    LANGUAGE & LANGUAGE BEHAVIOR ABS. (Sociol. Abs., Inc.)
61        50    10    LISA (Learned Information Ltd.)
71        55    15    MLA BIBLIOGRAPHY (Modern Language Association)
21        35    15    NCJRS (National Criminal Justice Reference Service)
46        70    20    NICEM (National Information Center for Educational Media)
70        35    10    NICSEM/NIMIS (National Info. Cntr. for Special Education Materials)*
86        30    10    NIMH (National Clearinghouse for Mental Health Information, National Institute of Mental Health)
201       15    n/a   ONTAP™ ERIC
57        55    15    PHILOSOPHER'S INDEX (Philosophy Documentation Center)
91        55    10    POPULATION BIBLIOGRAPHY (University of North Carolina, Carolina Population Center)
11        65    10    PSYCHOLOGICAL ABSTRACTS (American Psychological Assoc.)
97        65    15    RILM ABSTRACTS (City University of New York, International RILM Center)
7         70    10    SOCIAL SCISEARCH® (Institute for Scientific Information)
37        55    15    SOCIOLOGICAL ABSTRACTS (Sociological Abstracts, Inc.)
93        65    15    U.S. POLITICAL SCIENCE DOCUMENTS (Univ. of Pittsburgh, Cntr. for International Studies)
120       35    10    U.S. PUBLIC SCHOOL DIRECTORY (National Center for Educational Statistics)

* Not Yet Available
DIALOG CUSTOMERS
BUSINESS & INDUSTRIAL FIRMS
UNIVERSITY & COLLEGE LIBRARIES
GOVERNMENTAL AGENCIES
SCHOOL SYSTEMS
PROFESSIONAL & NONPROFIT ASSNS
INDIVIDUAL PROFESSIONALS
PUBLIC LIBRARIES & CONSORTIA
DIALOG INFORMATION RETRIEVAL SERVICE
3460 Hillview Avenue, Palo Alto, CA 94304 (415) 858-2700 TELEX 334499

FOREIGN CUSTOMERS OF DIALOG INFORMATION SERVICES

ARABIAN GULF        FINLAND             LEEWARD ISLANDS
ARGENTINA           FRANCE              LUXEMBOURG
AUSTRIA             GERMANY             MALAYSIA
AUSTRALIA           HUNGARY             MEXICO
BAHRAIN             GUADELOUPE          NETHERLANDS
BELGIUM             GUATEMALA           NEW ZEALAND
BERMUDA             HONG KONG           NICARAGUA
BRAZIL              INDIA               NORWAY
BR. WEST INDIES     IRAQ                PHILIPPINES
CANADA              IRELAND             PORTUGAL
CHILE               ISRAEL              QATAR
REPUBLIC OF CHINA   ITALY               PUERTO RICO
COSTA RICA          JAMAICA             SAUDI ARABIA
COLOMBIA            JAPAN               SCOTLAND
DENMARK             KENYA               SINGAPORE
ECUADOR             SO. KOREA           SOUTH AFRICA
EGYPT               KUWAIT              SPAIN

LOCKHEED INFORMATION SYSTEMS / LOCKHEED MISSILES & SPACE COMPANY, INC.
SWEDEN
SWITZERLAND
SYRIA
TAIWAN
TANZANIA
TASMANIA
THAILAND
TRINIDAD
UNITED KINGDOM
VENEZUELA
WALES
WEST INDIES
YUGOSLAVIA
DIALOG STATISTICAL HIGHLIGHTS

                              1973     1978            1980         1980 VS. 1973
NUMBER OF DATABASES           6        80              110          18.3 x
MILLIONS CITATIONS/ABSTS      1.0      21.0            40.0         40.0 x
COMPUTERS IN SERVICE          360/40   370/165, 3032   3033, 3032   -
BYTES OF STORAGE (BILLIONS)   1.0      20.0            40.0         40.0 x
NUMBER OF CUSTOMERS           80       4,000+          10,000+      125.0 x
NUMBER OF COUNTRIES           1        32              47           47 x
Present and Future

Although online retrieval and online cataloging are probably the most familiar applications of computer/communications technology to most of us, the whole area is alive with hardware developers, systems innovators, product entrepreneurs, and capital. As Alan Brigish, Editor of VideoPrint Newsletter, observed, the WUMPUS is awake. (Explain WUMPUS.) We seem to have reached that critical mass in hardware, software, market awareness, and economics that stimulates heavy venture capital activity. Take, for example, the following recent events:

• Chemical Abstracts Service, long a supplier of databases, has entered the online business with CAS ONLINE (a chemical substructures search service)
• Information Handling Service, long a publisher of specifications and standards on microfilm, has acquired Bibliographic Retrieval Services (an information retrieval service), and PREDICASTS, a database supplier
• Radio Suisse, a Swiss communications company (like AT&T), began a joint venture with BRS and PREDICASTS
• McGraw Hill, a long-time publisher, acquired Data Resources, Inc. (DRI), an econometric service
• Other traditional publishers such as Ziff-Davis, Pergamon, and Elsevier have acquired databases and/or service bureaus
• The SOURCE - a service for personal computer users - was recently bailed out of bankruptcy through purchase by the Reader's Digest
• EURONET, a Common Market telecommunications combine, was put into service to provide cheap telecommunications rates for European-based information retrieval services
• VIEWDATA, a technique for simple interaction, was developed by the British Post Office and is receiving much attention in various guises around the world
• MEAD has introduced NEXIS - a service of full-text news and magazine article retrieval
I could go on, but this should provide a flavor of the frantic activity in process to find a marketable combination of information content and delivery technique.

With the combination of computer storage and processing coupled with telecommunications, we face a new milestone in the evolution of civilization which, to my mind, is equal in significance to the development of a written language and the invention of the printing press. What good, after all, is the recording of knowledge if one cannot identify the existence or source of needed information at the time of need? Written language allowed society to record and cumulate knowledge, as opposed to each generation having to spend most of its time and energy in relearning. The printing press popularized knowledge and led to public education, thus enormously increasing the number of potential contributors to technological progress. This proliferation of knowledge led to the information explosion, which now inundates us with so much information that significant items are often obscured. The development of computer-readable language and communications allows us to again become selective in retrieval and thus be freed of the burden of over-specialization necessitated by information overload.

But today's developments have even greater significance. Today's microprocessor technology is to the computer industry what movable type was to the printing industry. We can anticipate analogous impacts on society. As the information specialists in our society, you have the opportunity and even the obligation to be aware of this rapidly changing scene, for you are best able to select, from the nutritious soup that technology is providing, those services, materials, and building blocks most appropriate for information transfer.

Information retrieval has had a profound effect on the profession of librarianship, but it is only the beginning. The library of 1985 will become a virtual information system which can provide low-cost access to and on-demand delivery of the world's accumulated store of knowledge. Effective utilization of this potential requires an already emerging library professional who is not only versed in information sources and delivery techniques, but also possesses the entrepreneurial qualities of marketing and resource management necessary for the successful operation of a large-scale service enterprise.
COMMUNICATION LEVEL            KNOWLEDGE TRANSFER

• NON-VERBAL                   LITTLE TRANSFER

• VERBAL LANGUAGE              MYTHS, STORY TELLING
                               BEGINNINGS OF EDUCATION

• WRITTEN LANGUAGE             BOOKS AND ARCHIVES
                               LIBRARIES

• PRINTING PRESS               POPULARIZATION OF KNOWLEDGE
                               BEGINNING OF INFORMATION EXPLOSION
                               INCENTIVE TOWARD SPECIALIZATION

• COMPUTER READABLE LANGUAGE   AUTOMATIC RETRIEVAL
                               BEGINNINGS OF A KNOWLEDGE MACHINE
                               HIGH TRANSFER EFFICIENCY
REFERENCES

(1) King, Gilbert W., Automation and the Library of Congress, Library of Congress, Washington, D. C., 1963
(2) "An On-Line Technical Library Reference Retrieval System," American Documentation, January 1966
(3) Summit, Roger K., Remote Information Retrieval Facility, National Aeronautics and Space Administration (NASA CR-1318), Washington, D. C., April 1969