Information Retrieval - Past, Present, and Future


INFORMATION RETRIEVAL PAST, PRESENT AND FUTURE by ROGER K. SUMMIT

KEYNOTE ADDRESS DELIVERED AT THE ANNUAL NMLA CONFERENCE ALBUQUERQUE, 30 APRIL 1981


INTRODUCTION

Before a detailed look at online information retrieval, it is useful to take a somewhat broader look into the computer and its implications to the library community. Many of your children will develop the same relationship to computers that most of us developed to books. The computer will become a primary means of access to information, knowledge, and entertainment. Moreover, development of skills in using computers is as important to the person of today as learning to read was to our grandparents. It has only been in the past ten years, however, that computers have come to be used extensively for the manipulation of words (as opposed to the manipulation of numbers).

Most people still think of computers as "number crunchers" rather than as knowledge machines, and they think of computer programmers as some special type of mathematician. For many applications, nothing could be further from the truth. Computers are generalized code manipulators where the codes can stand for numbers, letters of the alphabet, or even instructions that tell the machine how to manipulate these codes, or how to communicate with a remote user terminal. Programmers are people versed in organizing and manipulating symbols. Educationally they come from a variety of disciplines.

The computer is the machine that made possible what we think of as online services today.

Online services simply represent instances where someone was able to visualize the applicability of modern day data processing techniques to human needs. The word "online" implies that there is a human being within the data processing loop who guides and controls the computer in performing its assigned tasks, as opposed to "batch" processing wherein once initiated, a given set of tasks processes to completion without human intervention. The beauty and power of online applications result from the combining of the strengths of humans - decision making, recognition, reasoning - with those of the computers - rapid manipulation of data.

One does not have to be a programmer in order to use the power of the computer. Each online application has associated with it a high-level command language which the user employs to direct the computer in its processing.

Behind each command, however, are vast numbers of machine instructions written by programmers which carry out the high-level command. DIALOG is one such high-level command language.


Of the many online applications such as cataloging, statistical modeling, accounting, acquisitions, and word processing, this talk will concentrate on online information retrieval, its origin, its significance to the library community, and trends you can anticipate in the future.

My examples, for the most part, will be taken from the DIALOG information retrieval service.

What is Online Retrieval?

From the librarian's point of view, online retrieval is simply another in a long list of techniques which has evolved to better control and organize recorded information. Several tools have evolved over the years in response to the changing needs of the reader and the development of new technologies.

SLIDE 1 (WORLD OF LIB. REF. TOOLS)

As we all know, the Dewey decimal system provides a correspondence coding between physical shelf location and subject by classifying books into a single subject category, and shelving them accordingly. The problem is that a book or document may pertain to several subjects. The familiar library card catalog provides an answer to this problem. As we know, in this case a document is assigned one or more subject headings. Cards are produced for each heading and the cards are sorted together to provide an alphabetical subject index in card form. But what do you do with journals or magazines where each article may pertain to a different subject? The obvious answer is to index each article, the result of which can be carried in a card catalog (which is likely to outgrow any space designed to contain it), or to publish the results in a bound volume such as the Reader's Guide to Periodical Literature; thus the book catalog and the evolution of abstracting and indexing services.

There are, however, several problems in the retrieval of information from manual systems.

With the Dewey approach of faceted classification, to discover the proper category to access, the user's point of view must parallel that of the indexer or classifier for retrieval to occur. If the document can be classified several different ways, it may be missed entirely.

THE WORLD OF LIBRARY REFERENCE TOOLS

CURRENTLY AVAILABLE.

FUTURE CONSIDERATION

DEWEY DECIMAL

STATISTICAL SERVICES

CARD AND BOOK CATALOGS

FACT-FINDING SERVICES


ABST. AND INDEXING PUBLICATIONS

INTERLIBRARY LOAN

VIEW DATA SYSTEMS

ON LINE RETRIEVAL SERVICES

DOCUMENT FULFILLMENT SERVICES

VIDEO DISK ARCHIVING


Furthermore, for a manual system to be manageable at all, the number of subject headings in the overall classification scheme must be rather severely limited or the problem of discovering the proper category to look under approaches the difficulty of finding the relevant document with no classification. With a limited number of subject headings, the number of documents in any single category becomes large and necessitates much serial scanning.

Online retrieval systems overcome many of the problems of manual search systems through value-added processing of the tapes received from abstracting and indexing organizations wherein detailed indexes are created from the titles and abstracts of the reference materials (which will be explored in greater detail later). The cost of this greater power, however, is complexity. Manual systems can be used on a largely intuitive basis and require little instruction. Communicating with a computer in the performance of a relatively complex task such as information retrieval requires the user to learn a specialized command language.

A couple of search examples are worth a thousand words.

SLIDE 2 (COFFEE SEARCH)

Here we have instructed the computer to search SCISEARCH (Science Citation Index from the Institute for Scientific Information) for a topic involving the effects of coffee or similar stimulant beverages on the heart or stomach. We set up our search by telling the computer to SELECT (a command) lists of terms (e.g., coffee, tea, or caffeine) we might expect to find in the titles - in this case - of articles of interest. These terms are grouped by concept and the concept groups are combined to define the search. The result is a set of items, drawn from a collection of over 1 million items searched, which corresponds to our topic and which we can TYPE or PRINT OFFLINE. We choose to TYPE in the illustration.

DIALOG allows us to be more specific than simply specifying the occurrence of words - we can also specify the proximity of words or use word stems as shown in the next example.

SLIDE 3 (TAX AVOIDANCE)

Again the command is SELECT, but in this case we indicate (with codes) that we want either one of two conditions: the word "tax" is to occur within


? begin 34
          23oct80 16:40:20 User3468
    $2.64   0.022 Hrs File34   3 Descriptors
File34:SCISEARCH 78-80/WK38
(Copr. ISI Inc.)
SEE FILE 94(74-77)
    Set Items Description (+=OR;*=AND;-=NOT)
? select coffee or tea or caffeine
           229  COFFEE
           348  TEA
           536  CAFFEINE
      1   1085  COFFEE OR TEA OR CAFFEINE
? select health or disease? or adverse or toxic?
          5399  HEALTH
         24342  DISEASE?
           465  ADVERSE
          6199  TOXIC?
      2  36065  HEALTH OR DISEASE? OR ADVERSE OR TOXIC?
? select stomach or heart
          1106  STOMACH
          7933  HEART
      3   9037  STOMACH OR HEART
? combine 1 and 2 and 3
      4      6  1 AND 2 AND 3
? type 4/3/1-4
4/3/1
1387802   OATS ORDERS: JT862   0 REFS
IS DRINKING OF COFFEE PERMISSIBLE TO PATIENTS SUFFERING FROM CORONARY HEART-DISEASE (GERMAN)
GOTTWIK MG
UNIV HEIDELBERG,MED KLIN/D-6900 HEIDELBERG 1//FED REP GER/
MEDIZINISCHE WELT, V31, N22, P825-827, 1980
4/3/2
1147084   OATS ORDERS: JA139   10 REFS
TANNIC-ACID OF TEA AND COFFEE AS INCRIMINATING FACTOR FOR HEART-DISEASE (ENGLISH)
PANDA NC; SAHU BK; RAO AG; PANDA SK
UNIV COLL VET SCI & ANIM HUSB/ORISSA//INDIA/; AGR TECHNOL/BHUBANESWAR 751003//INDIA/
INDIAN JOURNAL OF NUTRITION AND DIETETICS, V16, N9, P348-355, 1979
4/3/3
0870386   OATS ORDERS: HA106   4 REFS
STUDIES ON COMPATIBILITY OF COFFEE CONSUMPTION IN PATIENTS WITH CORONARY HEART-DISEASE (GERMAN)
STOCKSMEIER U; BONK S; NICKEL C; SCHROETER M
IPR INST SOZIALMED PRAEVENT & REHABIL EV,HOHENBERGSTR 12/D-8132 TUTZING//FED REP GER/
MEDIZINISCHE WELT, V30, N25, P976-979, 1979
4/3/4
0448939   OATS ORDERS: FS507   28 REFS
COFFEE CONSUMPTION AND MORTALITY TOTAL MORTALITY, STROKE MORTALITY, AND CORONARY HEART-DISEASE MORTALITY (ENGLISH)
HEYDEN S; TYROLER HA; HEISS G; HAMES CG; BARTEL A
DUKE UNIV,MED CTR,DEPT COMMUNITY & FAMILY MED/DURHAM//NC/27710; UNIV N CAROLINA,SCH PUBL HLTH,DEPT EPIDEMIOL/CHAPEL HILL//NC/27514; EVANS CTY HEART STUDY,DEPT HLTH/CLAXTON//GA/
ARCHIVES OF INTERNAL MEDICINE, V138, N10, P1472-1475, 1978
? .cost
          23oct80 16:43:58 User3468
    [amount illegible]   0.062 Hrs File34   9 Descriptors


A sample search.

? begin 47
          27feb81 15:26:35 User3468
File47:Magazine Index - 77-81/Feb (Copr. IAC)
    Set Items Description (+=OR;*=AND;-=NOT)
? select tax(w)avoidance/ti or tax(f)defer?
             6  TAX(W)AVOIDANCE/TI
            10  TAX(F)DEFER?
      1     16  TAX(W)AVOIDANCE/TI OR TAX(F)DEFER?
? type 1/5/1-10
1/5/1
1569595
Hope for the high flyers. (tax-deferred stock option) (column)
Conway, John A.
Forbes v126 p10(1) Dec 8 1980
CODEN: FORBA
ARTICLE TYPE: biography
NAMED PEOPLE: Dole, Robert-economic policy
DESCRIPTORS: stocks-taxation; taxation-securities; income tax-securities
1/5/2
1547134
Rare coin investments tax sheltered - tax deferred. (advertisement)
Lee, Edward C.
Barrons v60 p35(2) Sept 29 1980
CODEN: BRNSA
illustration
DESCRIPTORS: tax planning-analysis; coins as an investment-taxation
investments-taxation; capital investments-taxation
1/5/3
1514648
Today's tax-shelter maze: find the right route for you: ways to defer ta
payments, avoid legal traps, find financial breaks.
Scollard, Jeannette Reddish
Vogue v170 p47(1) July 1980
CODEN: VOGUB
DESCRIPTORS: finance, personal-taxation; tax planning-technique
investments-taxation
1/5/4
1474838
Second thoughts. (Universal Life Church tax avoidance) (column)
McWilliams, Carey
Nation v230 p422(1) April 12 1980
CODEN: NATNB
DESCRIPTORS: Universal Life Church-taxation; church and state-taxatic
exemption from-moral and religious aspects; Association of America
Patriots-political activity
1/5/5
1427025
Tax-deferred exchanging.
Ferguson, Ron
Real Estate Today v12 p19(4) Dec 1979
CODEN: RESTDR
illustration
DESCRIPTORS: real property, exchange of-taxation; tax planning-techniq


one word "(W)" of "avoidance" and this word pair is to occur in the title (TI); or "tax" is to occur in the same field as words having a stem of "defer" (such as deferred, deferral).
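The two conditions in this SELECT statement can be sketched in a few lines of Python. This is only an illustration, not DIALOG's implementation: the function names are hypothetical, "(W)" is assumed to mean that the first word is immediately followed by the second, "(F)" that both words occur anywhere in the same field, and a trailing "?" that any word beginning with the stem is accepted.

```python
import re

def tokenize(text):
    """Lower-case a field and split it into words."""
    return re.findall(r"[a-z0-9]+", text.lower())

def matches_term(word, term):
    """A trailing '?' on a term accepts any word that begins with the stem."""
    if term.endswith("?"):
        return word.startswith(term[:-1])
    return word == term

def adjacent(field_text, first, second):
    """Assumed reading of (W): `first` is immediately followed by `second`."""
    words = tokenize(field_text)
    return any(
        matches_term(a, first) and matches_term(b, second)
        for a, b in zip(words, words[1:])
    )

def same_field(field_text, first, second):
    """Assumed reading of (F): both terms occur in the same field, any order."""
    words = tokenize(field_text)
    return (any(matches_term(w, first) for w in words)
            and any(matches_term(w, second) for w in words))

def hit(record):
    """tax(w)avoidance/ti OR tax(f)defer? applied to one record (a dict of fields)."""
    in_title = adjacent(record["title"], "tax", "avoidance")
    in_any_field = any(same_field(text, "tax", "defer?") for text in record.values())
    return in_title or in_any_field

# Two titles taken from the sample output; the descriptor strings are shortened.
church = {"title": "Second thoughts. (Universal Life Church tax avoidance)",
          "descriptors": "church and state; political activity"}
exchange = {"title": "Tax-deferred exchanging.",
            "descriptors": "real property, exchange of; tax planning"}

print(hit(church), hit(exchange))
```

The first record qualifies through the title condition and the second through the stem condition, which is why both appear in the same result set.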

In this case we are searching Magazine Index, a database of popular magazines, so the references are easily obtained from the public library.

Early Information Retrieval Systems

But before we get ahead of our story, let us examine some of the historical underpinnings of online retrieval. Less than thirty years ago the state-of-the-art was much different. One of the earliest successful attempts at mechanical information retrieval was the Searching Selector at Western Reserve University.

SLIDE 4 (SEARCH SELECTOR)

This machine was developed by Allen Kent and J. W. Perry in the mid-1950s. The young woman at the right is in the process of programming the machine to search ten simultaneous questions. The "library" is encoded into the rolls of punched paper tape. The codes for each document are matched against each query. If there is a sufficient match, the document number is printed out on the Flexowriter. The size of the "library" is limited only by the patience of the operator. This device is one of the earliest approaches to electro-mechanical, bibliographic information retrieval and is a functional precursor to the batch-search computer systems developed and used with first and second generation computers.

Several organizations developed and/or provided batch search services using first and second generation computers in the late 1950s and in the early 1960s including - among others - Western Reserve University, NASA, Defense Documentation Center, the National Library of Medicine, and MIT. Batch-search systems are characterized by tape or punched-card input, printer output, and single job processing.

In batch searching a set of queries is coded by combining keywords or descriptors in Boolean fashion on punched cards and matching each query against each record (a bibliographic citation) on a large, sequential tape file. If there is a match, the record is listed; if not, the search proceeds to the next record. The deficiencies of batch searching are widely known: if the query is too specific (i.e., too many 'ands') the result is likely to be zero; if slightly less specific, a large portion of a large file can be printed.
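The batch procedure just described amounts to one sequential pass over the file with every query tested against every record. A minimal sketch follows, under simplifying assumptions: each query is reduced to a plain AND of keywords, the "tape" is a Python list, and all names and sample records are invented for illustration.

```python
def matches(query, keywords):
    """A query here is a list of required keywords (an implicit AND)."""
    return all(term in keywords for term in query)

def batch_search(queries, tape):
    """One sequential pass over the file, testing every query on every record."""
    hits = {name: [] for name in queries}
    for accession_number, keywords in tape:   # records are read in tape order
        for name, query in queries.items():
            if matches(query, keywords):
                hits[name].append(accession_number)
    return hits

# Hypothetical three-record "tape": (accession number, set of index terms).
tape = [
    (1, {"coffee", "heart", "disease"}),
    (2, {"tea", "stomach"}),
    (3, {"coffee", "stomach", "toxic"}),
]
queries = {
    "too_specific": ["coffee", "heart", "stomach", "toxic"],  # too many ANDs
    "broad": ["coffee"],
}
results = batch_search(queries, tape)
print(results)
```

The over-specified query returns nothing while the broad one returns most of the file, and the searcher learns neither fact until the whole pass has finished, which is the weakness the talk goes on to describe.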


The Western Reserve University Searching Selector is programmed for a search of encoded metallurgical literature. Unit at left is a Flexowriter, which "reads" the encoded "library" from punched paper tape. The unit at the right is programmed for simultaneous search of ten questions. Documents from the "library" bearing on the questions are then identified when the Flexowriter automatically types out the serial number corresponding to those documents.


Any subsequent revision of the search must be resubmitted and reprocessed, often requiring days or weeks to get the next set of results. Furthermore, the processing time for a batch of searches on one of the computers of the day could take several hours to complete. NASA, one of the leaders of the day in sponsoring the development of retrieval systems, required - for example - eight hours on an IBM 1410 to process a batch of searches against its file of 200,000 aerospace report citations.

DIALOG Development

It was within this context of laborious and inefficient batch searching that the idea of DIALOG was born. Several critical events occurred in 1964:

• The King report "Automation and the Library of Congress" (published in 1963) was reviewed at Lockheed where it tickled the imagination of a Lockheed executive vice president (2)

• A prototype online retrieval system called CONVERSE had been successfully tested at Lockheed which utilized an RCA computer with random access disks and an online input device

• IBM announced its third generation computer series, the IBM/360, in April of that year

• A proposal "LMSC Information Storage and Retrieval Study Plan" was submitted to management in April 1964

The King report provided the precipitant stimulus; the CONVERSE experiment provided demonstrable credibility; the IBM third generation hardware with mass random access storage and interactive processing capability provided the means; and the proposal to Lockheed management resulted in the reorganization and funding necessary to establish the Lockheed Information Retrieval Laboratory in January of 1965. The objective of the new organization was stated in the proposal as follows: "Establish a program to investigate the information retrieval problem. Initial emphasis will be placed on a particular problem area, namely, the library problem which includes storage, retrieval, dissemination and display elements which are applicable to other information retrieval areas."


An IBM 360/30 computer with 32K of internal memory, a 400 megabyte data cell mass storage device, and two 5.5 megabyte disks were installed in November of 1965. This configuration, except for the mass storage devices, was little more powerful than the microprocessors of today, and was to be the development environment for DIALOG.

SLIDE 5 (ORIGINAL COMPUTER)

The design of DIALOG originated from a belief that proper use of third generation computer technology could overcome most, if not all, of the problems of manual and mechanical retrieval approaches described previously.

One of the problems with batch searching was the necessity to formulate extremely complex expressions for all but the simplest searches. Considering overall system requirements, why not design a language that would allow the searcher to break down a complex formulation into a series of simpler steps which could be executed one at a time, and then provide a facility to combine the results of the individual steps into the more complex formulation? If the technique works in problem solving, perhaps it would work in searching. Furthermore, such a design philosophy would allow the searcher to know the quantitative result of each step which could help to guide formulation of the next step. To assist in vocabulary selection, it was decided to provide some form of alphabetical index display showing posting counts and related term counts. Furthermore it seemed desirable to allow the searcher to display "hits" at any time to provide for intermediate validation and/or search redirection. To facilitate this "cut and try" philosophy, it was necessary to save the results of each search statement, and to allow this result to be treated as a single element in any subsequent statement.

This feature of recursion, incidentally, is probably the single most powerful aspect of online searching, and yet it was the most controversial in our early design discussions. Some of the group argued for the saving of only a single resultant set for sake of economy (the approach taken by Data Central, later acquired by Mead). In addition, the system must provide a quick response, and allow the results of any search to be printed in bulk offline. Finally, the procedures had to be simple to understand and efficient in execution.

If a system could be designed to fulfill these objectives, it was felt that we could leap-frog over the existing batch-oriented systems, and establish ourselves as a predominant force in the then infant field of information retrieval.

But how to translate these general requirements into procedural practice? The first consideration was the design of the language. It must be formulated in a manner that is easy to learn and yet powerful in result. Computer languages of the time were oriented to number manipulation, and were relatively complex and difficult to learn. But why not design "commands" to be used by the non-specialist that would themselves call up detailed computer programs to perform the specified operation? The idea of providing a command to define a functional processing step, together with an operand or data string which would tell the command what and how to process a given set of data, seemed to have the power and generality that were required. As it turned out, there needed to be just five basic commands as shown in the next slide.

SLIDE 6 (EARLY COMMANDS)

In parallel with design of the commands, the overall file structure had to be considered.

Sequential search techniques as used in batch systems simply could not perform responsively in an online system. The alternative was an inverted file structure, which was selected. An inverted file is like a concordance or a back-of-the-book index wherein every word is associated with a list of record numbers (accession numbers) of the citations which contain that word. It is produced by processing the sequential or linear file to extract accession number/keyword pairs. These pairs are then sorted into word order, with the accession numbers for each word being placed in a list or string. Such an arrangement would allow a Boolean query for Russia and satellites, for example, to be processed by matching the string of accession numbers associated with "Russia" with the string associated with "satellites." If the same accession number appears in both strings, the citation it identifies must contain both words. An OR condition is accomplished by merging the strings; a NOT by removing from the first string any accession numbers that also appear in the second. The result of any query statement is a set of accession numbers.
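The inverted file and its Boolean operations can be illustrated with a short Python sketch. It is a toy model under stated assumptions, not the original implementation: the three-record linear file, the whitespace tokenizer, and the function names are all invented for the example.

```python
from collections import defaultdict

def build_inverted_file(linear_file):
    """Extract (word, accession number) pairs, sort them, and group by word."""
    pairs = []
    for accession_number, text in linear_file.items():
        for word in set(text.lower().split()):
            pairs.append((word, accession_number))
    pairs.sort()                              # word order, then accession order
    inverted = defaultdict(list)
    for word, accession_number in pairs:
        inverted[word].append(accession_number)
    return dict(inverted)

def and_strings(a, b):
    """AND: walk two sorted strings in step, keeping the common numbers."""
    i = j = 0
    common = []
    while i < len(a) and j < len(b):
        if a[i] == b[j]:
            common.append(a[i])
            i += 1
            j += 1
        elif a[i] < b[j]:
            i += 1
        else:
            j += 1
    return common

def or_strings(a, b):
    """OR: merge the two strings, dropping duplicates."""
    return sorted(set(a) | set(b))

def not_strings(a, b):
    """NOT: keep the numbers in the first string that are absent from the second."""
    excluded = set(b)
    return [n for n in a if n not in excluded]

# Hypothetical linear file: accession number -> citation text.
linear_file = {
    101: "Russia launches new satellites",
    102: "weather satellites improve forecasts",
    103: "Russia grain harvest",
}
inverted = build_inverted_file(linear_file)
russia, satellites = inverted["russia"], inverted["satellites"]

print(len(russia), len(satellites))       # posting counts, known before any record is read
print(and_strings(russia, satellites))    # [101]
print(or_strings(russia, satellites))     # [101, 102, 103]
print(not_strings(russia, satellites))    # [103]
```

Every operation works on accession numbers alone, so the hit count for each step is available without touching the linear file.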


EARLY DIALOG COMMANDS

BEGIN - TO INITIALIZE THE USER AREA

EXPAND (TERM) - TO DISPLAY THE ALPHABETICALLY NEAR TERMS WITH POSTING COUNTS TO AN INPUT TERM

SELECT (TERM) - TO INDICATE TERMS TO BE USED IN SEARCH. SUCH TERMS WERE ASSIGNED SET NUMBERS FOR EASY REFERENCE LATER

COMBINE (SETS) - TO PROVIDE FOR BOOLEAN COMBINATION OF SETS. THE RESULT OF A COMBINE WAS A SET THAT COULD BE USED IN SUBSEQUENT COMBINES

DISPLAY, TYPE, PRINT (SET/FMT/ITEMS) - TO OUTPUT INDIVIDUAL CITATIONS/ABSTRACTS TO CRT'S, TYPEWRITERS, OR OFFLINE PRINTOUTS, RESPECTIVELY

END - ENDED THE SEARCH AND REQUESTED THE USER TO ENTER AN EVALUATION OF THE SEARCH RESULTS
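The way these commands build on one another, with every result saved as a numbered set that later commands can reuse, can be sketched as a small Python class. This is a hypothetical model for illustration only: the class, its method signatures, and the tiny index are assumptions, and the output commands and END are omitted.

```python
import bisect

class Session:
    """Toy search session; `index` maps each term to a set of accession numbers."""

    def __init__(self, index):
        self.index = index
        self.begin()

    def begin(self):
        """BEGIN: initialize the user area by discarding any saved sets."""
        self.sets = {}

    def _save(self, items):
        """Store a result under the next set number; return (set number, item count)."""
        number = len(self.sets) + 1
        self.sets[number] = frozenset(items)
        return number, len(items)

    def expand(self, term, width=1):
        """EXPAND: alphabetically near terms with their posting counts."""
        terms = sorted(self.index)
        pos = bisect.bisect_left(terms, term)
        nearby = terms[max(0, pos - width): pos + width + 1]
        return [(t, len(self.index[t])) for t in nearby]

    def select(self, *terms):
        """SELECT: OR together the postings of the terms and save the result."""
        items = set()
        for term in terms:
            items |= self.index.get(term, set())
        return self._save(items)

    def combine(self, operator, *set_numbers):
        """COMBINE: apply AND, OR, or NOT to saved sets and save the result."""
        saved = [self.sets[n] for n in set_numbers]
        result = set(saved[0])
        for other in saved[1:]:
            if operator == "and":
                result &= other
            elif operator == "or":
                result |= other
            elif operator == "not":
                result -= other
            else:
                raise ValueError(f"unknown operator: {operator}")
        return self._save(result)

# Hypothetical index; the accession numbers are invented.
index = {"coffee": {1, 3, 5}, "tea": {2, 5}, "caffeine": {4},
         "heart": {1, 2, 6}, "stomach": {3}, "toxic": {1, 3, 4}}
session = Session(index)
print(session.expand("tea"))                        # [('stomach', 1), ('tea', 2), ('toxic', 3)]
print(session.select("coffee", "tea", "caffeine"))  # (1, 5)
print(session.select("toxic"))                      # (2, 3)
print(session.select("stomach", "heart"))           # (3, 4)
print(session.combine("and", 1, 2, 3))              # (4, 2)
print(session.combine("or", 4, 2))                  # (5, 3): set 4 reused as an operand
```

Because a COMBINE result is itself a set, it can appear as an operand in any later COMBINE, which is the property the talk refers to as recursion.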


Such an approach would not only allow for rapid searching, but would also provide the user with a count of the hits prior to any access to the sequential (linear) file. Furthermore, utilizing the random access capability of disk storage devices, we could build an index to the linear file which would allow any item to be immediately called up and displayed. Another thought - why not build an index to the inverted file as well which would allow the indexing vocabulary to be displayed to assist the user in selecting terms? Finally, if the index would contain the word count, the user would have an immediate idea of the utility of the word as a search word. How elegant! It all came together one morning in January of 1966; the preliminary design specification for DIALOG had been completed.

Key System Development Milestones

Between 1966 and 1970 there were five key events which provided the DIALOG project internal viability within Lockheed, and external visibility around the world:

• NASA prototype and development contracts

• COSATI panel demonstration

• ESRO and AEC development contracts

• ERIC services contract

• ASIS exhibit

With the encouragement of Mel Day, we submitted a proposal to NASA in early 1966 for a prototype online retrieval service which resulted in the award of a $20K contract to install and operate a remote terminal at Ames Research Center utilizing DIALOG to access the NASA file of 260K citations. DIALOG first became operational on the file in November 1966 and the remote terminal, an IBM 2260 display terminal, was installed at NASA in April 1967. The controller for the display, incidentally, was too large to be transported up the staircase to the second story library and had to be installed by knocking out a window casing and raising the controller by crane. The installation cost nearly equalled the cost of the project itself. The results of this project are reported in Reference 3.


Online retrieval proved to be popular among the scientific staff at Ames. One of the few problems which arose came from a librarian who complained that there was so much demand for searching, she had been forced to forego a committee meeting and several coffee breaks to keep up with the backlog.

The NASA/Ames prototype contract was a key event in that it not only established the viability of the concept with Lockheed management, but it proved that people would voluntarily use a terminal to communicate with a computer to retrieve information. We learned that both librarians and engineers could understand the use of Boolean operators (and, or, not) in searching a database. Moreover, this contract gave the project the exposure needed to attract future opportunities. The NASA/RECON development contract first suggested a realistic possibility of information retrieval as a formal line of business at Lockheed.

In 1968 COSATI (the Government Interagency Committee on Scientific and Technical Information) invited several online retrieval systems to demonstrate their capabilities on a file of project descriptors, and produced a film of the demonstrations under the auspices of Battelle Memorial Institute. The COSATI demonstration could be likened to an invitational state-of-the-art conference. The conference was attended by Lockheed, Mead, SDC, and Computer Corporation of America. Only Mead and Lockheed demonstrated online retrieval systems. The SDC and CCA systems were forerunners of database management systems.

While viewing, remember this was 1968 state-of-the-art. Note the extent to which the approach has survived and has reappeared in other subsequently developed systems. Also note the foresight of the narrator at the end of the film in predicting international networks of computers and users.

(COSATI FILM)


As a result of the COSATI demonstration in 1968, Harvey Marron of the U.S. Office of Education (USOE) became convinced that their ERIC file of educational research would be well-served by this new technology. A series of contracts initiated services on this database to some six USOE-sponsored terminal sites between 1970 and 1972. These contracts marked the shift in our emphasis from that of systems developer to that of service vendor.

The European Space Research Organization and Atomic Energy Commission software installation contracts were our final development contracts, and they reinforced our decision to shift to services in that they diverted significant amounts of human resources from database loading and system enhancements which were needed to support the services environment.

Online had in no way yet captured the imagination of the information community, however. The 1969 ASIS Meeting in San Francisco included a special online bazaar which was not nearly as interesting to attendees as Doug Engelbart's word-processing system (Augmented Human Intellect) which was also demonstrated.

It was soon after this meeting that the first database supplier contract was struck for a commercial database called PANDEX. I remember the negotiation - Dick Kolin of Crowell, Collier and Macmillan wanted considerable up-front money and a percentage of gross for any use of PANDEX online - probably a result of his role as a New York publisher. We finally settled for royalties of something on the order of $10 per hour and 5¢ per offline print. Little did either of us know we were setting an industry contracting and pricing standard.

Key Milestones in the 1970s

In 1970 DIALOG began to provide a true online retrieval service. In addition to the ERIC centers at Stanford University and in Wash., D.C. (ERIC contained 12,000 citations at the time), we initiated an inhouse service at Lockheed on the AEC, NASA, and PANDEX files.


1970 included several other interesting events:

• First Computer Communications, Inc. (CCI) terminal arrived (heavy, but portable - allowed dialup)

• ERIC was demonstrated at The White House Conference on Children (55K citations)

• First transoceanic demonstration of online retrieval held (Paris to Palo Alto searching the Nuclear Science Abstracts database)

By most measures, 1970 was the year - one decade ago - that marked the true beginning of third party online retrieval service (i.e., service to organizations who were neither the supplier of the database nor the operator of the computer system). Developments followed thick and fast during the early 1970s.

1971

• Free-text or proximity searching was added to DIALOG

• System Development Corporation (SDC) survey on the potential of online services was conducted

• Two additional ERIC sites were added - C.E.C. and RISE

• Council for Exceptional Children (C.E.C.) database was added to DIALOG

I remember my shock at seeing the SDC survey sent by Carlos Cuadra to some 8,000 organizations to inquire of their interest in online searching: which databases, how much would they pay, etc. The "secret" of the potential for online searching was out, it appeared. This survey served to convince my management at Lockheed that competition was nipping at our heels and that we must redouble our efforts to maintain our position in the field. Ironically it turned out, unbeknownst to us at the time, that only 80 of the 8,000 questionnaires were ever returned, and that SDC almost decided not to enter the online arena as a result.


1972

• Search-save feature was added to DIALOG

• Dialup service initiated

• CALSPAN and GE San Jose signed up for service

• Competitive National Technical Information Service (NTIS) award was received for service to 5 terminals

A project report written by Bob Donati, then and still Manager of our New York Office, summarized 1972 achievements:

                             1 Jan. 1972    31 Dec. 1972
Number of terminals               6              16
Subscribing organizations         5              13
Daily search hours               10              35
Records online                  100K            900K

1973

• Several databases were added: TRANDEX from C.C.M., Psychological Abstracts from the American Psychological Association, AGRICOLA (a name adopted much later) from the National Agricultural Library, and Science Abstracts from INSPEC

• First advertisement was prepared - 15 exposures during 1973

• First Users' Meeting held in our New York Office

• Washington Office opened with Rick Caputo

(SLIDE 7)

The most significant development in 1973, however, was the inter-connection of DIALOG and the TYMSHARE communications network. This connection provided a means for potential customers in over 40 U.S. cities to access DIALOG through a local telephone call at a flat charge of $10.00 per hour for telecommunications. The significance of such a service is that it ultimately has allowed the concentration of world-wide demand at a single processing center. Such a concentration provides the rationale for the offering of many small and specialized databases which otherwise would not be viable. This TYMSHARE service must be noted as one of the critical occurrences in the evolution of online searching.


1974

1974 saw a continuing buildup of databases with the addition of Predicasts, ISI, IFI/PLENUM, and Chemical Abstracts databases. The final event of importance in 1974 was the award of a grant from the National Science Foundation to study the utility of online searching in the public library. The significance of this grant was that it provided a potential avenue for use of online retrieval services by the general public. (Incidentally, we still have copies of the final report of this study available at no charge if you will contact me or give me your card.)

Present Service

From its humble beginning on the IBM 360/30 computer, the service has grown until it now is operated by two large-scale IBM computers with access to over 150 disk drives. For those of you who have not had a chance to visit us in Palo Alto, the next several slides provide a quick tour of our Palo Alto facility.

FACILITY SLIDES

• BUILDING

• CUSTOMER SERVICES

• TRAINING

• PUBLICATIONS

• ACCOUNTING

• PROGRAMMING

• COMPUTERS

• DISK STORAGE

DATABASE SLIDES

CUSTOMER PROFILE

COUNTRY PROFILE

GROWTH SUMMARY


DIALOG INFORMATION RETRIEVAL SERVICE

File No.     $      ¢     DATABASE (Supplier)

MULTIDISCIPLINARY
102         90     15     ASI (Congressional Information Service, Inc.)
101         90     15     CIS/INDEX (Congressional Information Service, Inc.)
35          55     12     COMPREHENSIVE DISSERTATION ABS. (Univ. Microfilms Inc.)
77          75     15     CONFERENCE PAPERS INDEX (Data Courier, Inc.)
200         15     n/a    DIALOG PUBLICATIONS (DIALOG Information Retrieval Service)
114         55     15     ENCYCLOPEDIA OF ASSOCIATIONS (Gale Research Company)
26          60     30     FOUNDATION DIRECTORY (The Foundation Center)
27          60     30     FOUNDATION GRANTS INDEX (The Foundation Center)
66          35     10     GPO MONTHLY CATALOG (U.S. Government Printing Office)
85          60     30     GRANTS DATABASE (Oryx Press)
47          45     10     MAGAZINE INDEX (Information Access Corp.)
78          60     30     NATIONAL FOUNDATIONS (The Foundation Center)
111         75     10     NATIONAL NEWSPAPER INDEX (Information Access Corporation)
211         95     10     NEWSEARCH (Information Access Corporation)
911         n/a    n/a    NEWSEARCH (Information Access Corporation)
49          60     15     PAIS INTERNATIONAL (Public Affairs Information Service, Inc.)
65          90     20     SSIE CURRENT RESEARCH (Smithsonian Science Info. Exchange)

BUSINESS/ECONOMICS
15          75     20     ABI/INFORM (Data Courier, Inc.)
19          60     20     CHEMICAL INDUSTRY NOTES (American Chemical Society)
100         90     $3     DISCLOSURE (Disclosure Incorporated)
90          65     20     ECONOMICS ABSTRACTS INTERNATIONAL (Learned Information Ltd.)
22          90     50     EIS INDUSTRIAL PLANTS (Economic Information Systems, Inc.)
92          90     50     EIS NONMANUFACTURING ESTABLISHMENTS (Economic Information Systems, Inc.)
105         45     25     FOREIGN TRADERS INDEX (U.S. Department of Commerce)
59          90     20     FROST & SULLIVAN DM 2 (Frost & Sullivan)
75          70     15     MANAGEMENT CONTENTS® (Management Contents, Inc.)
42          90     20     PHARMACEUTICAL NEWS INDEX (Data Courier, Inc.)
20          90     20     PTS FEDERAL INDEX (Predicasts, Inc.)
98          90     20     PTS F&S INDEXES 1972-1975 (Predicasts, Inc.)*
18          90     20     PTS F&S INDEXES 1976-present (Predicasts, Inc.)*
84          90     20     PTS INTERNATIONAL TIME SERIES (Predicasts, Inc.)*
83          90     20     PTS INTERNATIONAL FORECASTS (Predicasts, Inc.)*
17          90     20     PTS PREDALERT (Predicasts, Inc.)*
16          90     20     PTS PROMT (Predicasts, Inc.)*
81          90     20     PTS U.S. FORECASTS (Predicasts, Inc.)*
82          90     20     PTS U.S. TIME SERIES (Predicasts, Inc.)*
106         45     25     TRADE OPPORTUNITIES (U.S. Department of Commerce)
107         45     50     TRADE OPPORTUNITIES WEEKLY (U.S. Department of Commerce)
126         45     25     U.S. EXPORTS (U.S. Department of Commerce)

* Not Yet Available

File No.     $      ¢     DATABASE (Supplier)

APPLIED SCIENCE & TECHNOLOGY
45          35     10     APTIC (Air Pollution Tech. Info. Ctr. & the Franklin Institute)
44          35     15     AQUATIC SCIENCE & FISHERIES ABSTRACTS (NOAA)
112         35     15     AQUACULTURE (National Oceanic and Atmospheric Administration)
116         35     30     AQUALINE (Water Research Centre)
96          65     15     BHRA FLUID ENGINEERING (British Hydromechanics Research Association)
23          95     15     CLAIMS™/CHEM 1950-1970 (IFI/Plenum Data Company)
223        300     15     CLAIMS™/CHEM/UNITERM 1950-1970 (IFI/Plenum Data Company)
224        300     15     CLAIMS™/CHEM/UNITERM 1971-1977 (IFI/Plenum Data Company)
225        300     15     CLAIMS™/CHEM/UNITERM 1978-present (IFI/Plenum Data Company)
124         90     10     CLAIMS™/CLASS (IFI/Plenum Data Company)
24          95     15     CLAIMS™/U.S. PATENTS 1971-1977 (IFI/Plenum Data Company)
25          95     50     CLAIMS™/U.S. PATENT ABSTRACTS 1978-present (IFI/Plenum Data Company)
125         95     50     CLAIMS™/U.S. PATENT ABSTRACTS WEEKLY (IFI/Plenum Data Company)
8           65     15     COMPENDEX (Engineering Index, Inc.)
60          40     10     CRIS USDA (USDA)
69          90     20     ENERGYLINE® (Environment Information Center, Inc.)
40          90     20     ENVIROLINE® (Environment Information Center, Inc.)
68          60     15     ENVIRONMENTAL BIBLIOGRAPHY (Internatl. Acad. at Santa Barbara)
51          65     15     FOOD SCIENCE AND TECHNOLOGY ABS. (Intl. Food Info. Service)
79          55     10     FOODS ADLIBRA (K&M Publications, Inc.)
123         95     20     INPADOC (International Patent Documentation Center)
74          50     15     INTERNATIONAL PHARMACEUTICAL ABS. (Am. Soc. of Hospital Pharmacists)
14          75     15     ISMEC (Data Courier, Inc.)
32          80     12     METADEX (American Society for Metals)
118         45     20     NON-FERROUS METALS ABSTRACTS (British Non-Ferrous Metals Technology Center)
6           35     10     NTIS (National Technical Info. Service, U.S. Dept. of Commerce)
28          75     15     OCEANIC ABSTRACTS (Data Courier, Inc.)
48          55     15     PIRA (Research Assoc. for Paper & Board, Printing & Packaging Indus.)
41          75     15     POLLUTION ABSTRACTS (Data Courier, Inc.)
95          65     15     RAPRA ABSTRACTS (Rubber and Plastics Research Association of Great Britain)
115         65     15     SURFACE COATINGS ABSTRACTS (Paint Research Association of Great Britain)
63          40     10     TRIS (U.S. Department of Transportation and Transportation Research Board)
99          65     15     WELDASEARCH (The Welding Institute)
33          50     10     WORLD ALUMINUM ABSTRACTS (American Society for Metals)
67          55     10     WORLD TEXTILES (Shirley Institute)

* Not Yet Available

File No.     $      ¢     DATABASE (Supplier)

SCIENCE
110         25      5     AGRICOLA 1970-1978 (U.S.D.A. Technical Information Systems)
10          25      5     AGRICOLA 1979-present (U.S.D.A. Technical Information Systems)
55          45     10     BIOSIS PREVIEWS 1969-1973 (Biosciences Information Service)
5           45     10     BIOSIS PREVIEWS 1974-present (Biosciences Information Service)
2           70     20     CA SEARCH 1967-1971 (American Chemical Society)
3           70     20     CA SEARCH 1972-1976 (American Chemical Society)
4           70     20     CA SEARCH 1977-present (American Chemical Society)
31          70     20     CHEMNAME™
30          55     16     CHEMSEARCH™ (Chemical Abstracts Service, DIALOG Information Retrieval Service)
131         70     20     CHEMSIS™ (Chemical Abstracts Service, DIALOG Information Retrieval Service)
50          35     25     CAB ABSTRACTS (Commonwealth Agricultural Bureaux)
72          65     25     EXCERPTA MEDICA (Excerpta Medica)
73          65     25     EXCERPTA MEDICA IN PROCESS (Excerpta Medica)
58          70     20     GEOARCHIVE (Geosystems)
12          55     15     INSPEC 1969-1977 (Institution of Electrical Engineers)
13          55     15     INSPEC 1978-present (Institution of Electrical Engineers)
76          45     15     IRL LIFE SCIENCES COLLECTION (Information Retrieval Ltd.)
204         15     n/a    ONTAP™ CA SEARCH (American Chemical Society)
231         15     n/a    ONTAP™ CHEMNAME (American Chemical Society)
94          40     10     SCISEARCH® 1974-1977 (Institute for Scientific Information) subscriber
94         130     20     SCISEARCH® 1974-1977 (Institute for Scientific Information) nonsubscriber
34          30     10     SCISEARCH® 1978-present (Institute for Scientific Information) subscriber
34         120     20     SCISEARCH® 1978-present (Institute for Scientific Information) nonsubscriber
62          35     10     SPIN (American Institute of Physics)
52          45     15     TSCA INITIAL INVENTORY (Environmental Protection Agency, DIALOG Information Retrieval Service)

File No.     $      ¢     DATABASE (Supplier)

SOCIAL SCIENCES & HUMANITIES
9           25     10     AIM/ARM (Center for Vocational Education)
38          65     15     AMERICA: HISTORY & LIFE (ABC-Clio, Inc.)
56          60     15     ART MODERN (ABC-Clio, Inc.)
64          35     10     CHILD ABUSE AND NEGLECT (Natl. Cntr. for Child Abuse and Neglect)
1           25     10     ERIC (Educational Resources Information Center)
54          25     10     EXCEPTIONAL CHILD ED. RESOURCES (Council for Except. Children)
39          65     15     HISTORICAL ABSTRACTS (ABC-Clio, Inc.)
36          55     15     LANGUAGE & LANGUAGE BEHAVIOR ABS. (Sociol. Abs., Inc.)
61          50     10     LISA (Learned Information Ltd.)
71          55     15     MLA BIBLIOGRAPHY (Modern Language Association)
21          35     15     NCJRS (National Criminal Justice Reference Service)
46          70     20     NICEM (National Information Center for Educational Media)
70          35     10     NICSEM/NIMIS (National Info. Cntr. for Special Education Materials)*
86          30     10     NIMH (National Clearinghouse for Mental Health Information, National Institute of Mental Health)
201         15     n/a    ONTAP™ ERIC
57          55     15     PHILOSOPHER'S INDEX (Philosophy Documentation Center)
91          55     10     POPULATION BIBLIOGRAPHY (University of North Carolina, Carolina Population Center)
11          65     10     PSYCHOLOGICAL ABSTRACTS (American Psychological Assoc.)
97          65     15     RILM ABSTRACTS (City University of New York, International RILM Center)
7           70     10     SOCIAL SCISEARCH® (Institute for Scientific Information)
37          55     15     SOCIOLOGICAL ABSTRACTS (Sociological Abstracts, Inc.)
93          65     15     U.S. POLITICAL SCIENCE DOCUMENTS (Univ. of Pittsburgh, Cntr. for International Studies)
120         35     10     U.S. Public School Directory (National Center for Educational Statistics)

* Not Yet Available


DIALOG CUSTOMERS

BUSINESS & INDUSTRIAL FIRMS

UNIVERSITY & COLLEGE LIBRARIES

GOVERNMENTAL AGENCIES

SCHOOL SYSTEMS

PROFESSIONAL & NONPROFIT ASSNS

INDIVIDUAL PROFESSIONALS

PUBLIC LIBRARIES & CONSORTIA


INFORMATION RETRIEVAL SERVICE, 3460 Hillview Avenue, Palo Alto, CA 94304, (415) 858-2700, TELEX 334499

FOREIGN CUSTOMERS OF DIALOG INFORMATION SERVICES

ARABIAN GULF         FINLAND              LEEWARD ISLANDS      SWEDEN
ARGENTINA            FRANCE               LUXEMBOURG           SWITZERLAND
AUSTRIA              GERMANY              MALAYSIA             SYRIA
AUSTRALIA            HUNGARY              MEXICO               TAIWAN
BAHRAIN              GUADELOUPE           NETHERLANDS          TANZANIA
BELGIUM              GUATEMALA            NEW ZEALAND          TASMANIA
BERMUDA              HONG KONG            NICARAGUA            THAILAND
BRAZIL               INDIA                NORWAY               TRINIDAD
BR. WEST INDIES      IRAQ                 PHILIPPINES          UNITED KINGDOM
CANADA               IRELAND              PORTUGAL             VENEZUELA
CHILE                ISRAEL               QATAR                WALES
REPUBLIC OF CHINA    ITALY                PUERTO RICO          WEST INDIES
COSTA RICA           JAMAICA              SAUDI ARABIA         YUGOSLAVIA
COLOMBIA             JAPAN                SCOTLAND
DENMARK              KENYA                SINGAPORE
ECUADOR              SO. KOREA            SOUTH AFRICA
EGYPT                KUWAIT               SPAIN

LOCKHEED INFORMATION SYSTEMS / LOCKHEED MISSILES & SPACE COMPANY, INC.


DIALOG STATISTICAL HIGHLIGHTS

                               1973      1978             1980           1980 VS. 1973
NUMBER OF DATABASES            6         80               110            18.3 x
MILLIONS CITATIONS/ABSTS       1.0       21.0             40.0           40.0 x
COMPUTERS IN SERVICE           360/40    370/165, 3032    3033, 3032     -
BYTES OF STORAGE (BILLIONS)    1.0       20.0             40.0           40.0 x
NUMBER OF CUSTOMERS            80        4,000+           10,000+        125.0 x
NUMBER OF COUNTRIES            1         32               47             47 x


Present and Future

Although online retrieval and online cataloging are probably the most familiar applications of computer/communications technology to most of us, the whole area is alive with hardware developers, systems innovators, product entrepreneurs and capital. As Alan Brigish, Editor of VideoPrint Newsletter, observed, the WUMPUS is awake. (Explain WUMPUS.) We seem to have reached that critical mass in hardware, software, market awareness, and economics that stimulates heavy venture capital activities. Take, for example, the following recent events:

• Chemical Abstracts Service, long a supplier of databases, has entered the online business with CAS ONLINE (a chemical substructures search service)

• Information Handling Service, long a publisher of specifications and standards on microfilm, has acquired Bibliographic Retrieval Services (an information retrieval service), and PREDICASTS, a database supplier

• Radio Suisse, a Swiss communications company (like AT&T), began a joint venture with BRS and PREDICASTS

• McGraw Hill, a long-time publisher, acquired Data Resources, Inc. (DRI), an econometric service

• Other traditional publishers such as Ziff-Davis, Pergamon, and Elsevier have acquired databases and/or service bureaus

• The SOURCE, a service for personal computer users, was recently bailed out of bankruptcy through purchase by the Reader's Digest

• EURONET, a Common Market telecommunications combine, was put into service to provide cheap telecommunications rates for European-based information retrieval services

• VIEWDATA, a technique for simple interaction, was developed by The British Post Office and is receiving much attention in various guises around the world

• MEAD has introduced NEXIS, a service of full-text news and magazine article retrieval


I could go on, but this should provide a flavor of the frantic activity in process to find a marketable combination of information content and delivery technique.

With the combination of computer storage and processing coupled with telecommunications, we face a new milestone in the evolution of civilization which, to my mind, is equal in significance to the development of a written language and the invention of the printing press. What good, after all, is the recording of knowledge if one cannot identify the existence or source of needed information at the time of need? Written language allowed society to record and cumulate knowledge as opposed to each generation having to spend most of its time and energy in relearning. The printing press popularized knowledge and led to public education, thus enormously increasing the numbers of potential contributors to technological progress. This proliferation of knowledge led to the information explosion which now inundates us with so much information that significant items are often obscured. The development of computer-readable language and communications allows us to again become selective in retrieval and thus be freed of the burden of over-specialization necessitated by information overload.

But today's developments have even greater significance. Today's microprocessor technology is to the computer industry what movable type was to the printing industry. We can anticipate analogous impacts on society.

As the information specialists in our society, you have the opportunity and even the obligation to be aware of this rapidly changing scene, for you are best able to select, from this nutritious soup technology is providing, those services, materials and building blocks most appropriate for information transfer. Information retrieval has had a profound effect on the profession of librarianship, but it is only the beginning. The library of 1985 will become a virtual information system which can provide low-cost access to and on-demand delivery of the world's accumulated store of knowledge. Effective utilization of this potential requires an already emerging library professional who is not only versed in information sources and delivery techniques, but also possesses entrepreneurial qualities of marketing and resource management necessary for the successful operation of a large-scale service enterprise.


COMMUNICATION LEVEL            KNOWLEDGE TRANSFER

NON-VERBAL                     LITTLE TRANSFER

VERBAL LANGUAGE                MYTHS, STORY TELLING
                               BEGINNINGS OF EDUCATION

WRITTEN LANGUAGE               BOOKS AND ARCHIVES
                               LIBRARIES

PRINTING PRESS                 POPULARIZATION OF KNOWLEDGE
                               BEGINNING OF INFORMATION EXPLOSION
                               INCENTIVE TOWARD SPECIALIZATION

COMPUTER READABLE LANGUAGE     AUTOMATIC RETRIEVAL
                               BEGINNINGS OF A KNOWLEDGE MACHINE
                               HIGH TRANSFER EFFICIENCY


REFERENCES

(1) King, Gilbert W., Automation and the Library of Congress, Library of Congress, Washington, D.C., 1963

(2) "An On-Line Technical Library Reference Retrieval System," American Documentation, January 1966

(3) Summit, Roger K., Remote Information Retrieval Facility, National Aeronautics and Space Administration (NASA CR-1318), Washington, D.C., April 1969

