Information retrieval


Information Retrieval, Search Engines, and Application of Software in the Veterinary Field (AGB 111, 2013-14)


Information Retrieval (IR)

06/12/13

Dr R Jayashree, Asst. Professor(AGB) Veterinary College, Bangalore


What is information retrieval
• Gathering information from a source(s) based on an information need, usually from a query
• Sources of information
  – Other people
  – Archived information (libraries, maps, etc.)
  – Radio, TV, etc.
  – Web (search engines)
  – Nature
• Information retrieval is more than just web search



Information retrieval vs ?
• Information retrieval (IR) is the activity or process of obtaining information resources relevant to an information need from a collection of information resources.
• Data mining is the process that attempts to discover patterns in large data sets.
• Information extraction (IE) is the task of automatically extracting structured information from unstructured and/or semi-structured machine-readable documents.



Data, Information, Knowledge
• Data - Facts, observations, or perceptions. E.g. a telephone number
• Information - Subset of data, only including those data that possess context, relevance, and purpose. E.g. a phone book
• Knowledge - In a simplified view, knowledge sits at the highest level of a hierarchy, with data at the lowest level and information in the middle. E.g. recognizing the phone number of the person whom you want to contact



Ideal Information Retrieval
• Ask someone
• Search
  – Search for someone to ask
  – Search for needed information - library
  – Use a search engine
• Process of IR - queries or questions



Information to be retrieved
• Permanent vs Impermanent information
  – Conversation, events
  – Documents (in a general sense)
    • Text, tweets
    • Video
    • Files
    • Pictures
    • Data



How good is the IR system
Measures of performance based on what the system returns:
• Relevance
• Coverage
• Recency
• Functionality (e.g. query syntax)
• Speed
• Availability
• Usability
• Time/ability to satisfy user requests
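Relevance, the first measure listed, is commonly quantified with precision and recall. The slides do not name these metrics, so the sketch below is only an illustration under that assumption; the function name and the document IDs are made up.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for the results of one query."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = retrieved & relevant  # relevant documents that were returned
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical query: the system returned four documents,
# and three documents in the collection are actually relevant.
retrieved = ["d1", "d2", "d3", "d4"]
relevant = ["d2", "d4", "d7"]

p, r = precision_recall(retrieved, relevant)
print(f"precision={p:.2f} recall={r:.2f}")  # precision=0.50 recall=0.67
```

Precision asks how much of what was returned is useful; recall asks how much of what is useful was returned.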



How IR systems work
Algorithms implemented in software
• Gathering of information
• Storage of information
• Indexing
• Interaction
• Evaluation



Information Seeking Behavior

• Two parts of the process:
  – search and retrieval
  – analysis and synthesis of search results



[Figure: flow diagram of the IR process. Labelled components: User’s Information Need, text input, Parse, Query, Collections, Pre-process, Index, Rank or Match, Query Reformulation.]



Vannevar Bush - Memex - 1945
"A memex is a device in which an individual stores all his books, records, and communications, and which is mechanized so that it may be consulted with exceeding speed and flexibility. It is an enlarged intimate supplement to his memory."



IR History
• Interest in computer-based IR from the mid 1950s
• Key word indexing: H.P. Luhn at IBM (1958)
• Probabilistic models at Rand (Maron & Kuhns) (1960)
• Boolean system development at Lockheed ('60s)
• Vector Space Model (Salton at Cornell, 1965)
• Statistical weighting methods and theoretical advances ('70s)
• Refinements and advances in application ('80s)
• User interfaces, large-scale testing and application ('90s)
• Then came the web and search engines and everything changed



Search engines



Definition
• Web sites that search for documents based on keywords.
• Search engines typically search millions of web sites for text in web pages.



• Search engines use "spiders" which search the web for information.
• They are software programs (computer robots) which request pages.
• In addition to reading the contents of pages for indexing, spiders also record links.
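A minimal sketch of what a spider does with one page once it has been fetched: read the text for indexing and record the links. It uses Python's standard html.parser and urllib.parse modules; the class name LinkRecorder, the sample page and the example.com URLs are hypothetical, and no real page is requested here.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkRecorder(HTMLParser):
    """Collects the words and the outgoing links found in one HTML page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []   # absolute URLs the spider could visit next
        self.words = []   # page text, lower-cased, for the indexer

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    # Resolve relative links against the page's own URL.
                    self.links.append(urljoin(self.base_url, value))

    def handle_data(self, data):
        self.words.extend(data.lower().split())


# Hypothetical page content standing in for a fetched document.
page = (
    '<html><body><h1>Goat breeds</h1>'
    '<a href="/cattle.html">Cattle</a> '
    '<a href="http://example.org/sheep">Sheep</a></body></html>'
)

spider = LinkRecorder("http://example.com/index.html")
spider.feed(page)
print(spider.links)
print(spider.words)
```

A real spider would add the recorded links to a queue of pages still to be requested.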



A search engine operates in the following order
• Web crawling - Spiders fetch web pages so that their contents can be stored
• Indexing - The contents of each page are then analyzed; for example, words are extracted from the titles, headings, or special fields called meta tags
• Searching - Based on the keywords selected by the user
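The indexing and searching steps can be sketched with an inverted index, which maps each word to the pages that contain it. This is a simplified illustration, not how any particular search engine is built; the pages dictionary stands in for crawled content and its URLs are hypothetical.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Split text into lower-case words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(pages):
    """Indexing: map each word to the set of URLs whose text contains it."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in tokenize(text):
            index[word].add(url)
    return index


def search(index, query):
    """Searching: return the URLs that contain every keyword in the query."""
    words = tokenize(query)
    if not words:
        return set()
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return result


# Stand-in for the output of the crawling step.
pages = {
    "http://example.com/a": "Cloning of goat embryos",
    "http://example.com/b": "Goat milk production",
    "http://example.com/c": "Cattle cloning research",
}

index = build_index(pages)
print(search(index, "goat cloning"))  # {'http://example.com/a'}
```

Because the index is built once, a query only needs set lookups rather than a scan of every page.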



[Figure: A Typical Web Search Engine. Labelled components: Users, Interface, Query Engine, Index, Indexer, Crawler, Web.]




Types of Search Engine
• Individual search engine: It has its own database of websites and its own query methods, e.g. AltaVista, LookSmart, Ask, HotBot, etc.
• Meta search engine: It queries several individual search engines simultaneously and then amalgamates the results, e.g. MetaCrawler, Dogpile, Mamma, etc.
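The amalgamation step of a meta search engine can be sketched as merging several ranked result lists into one. In the sketch below, engine_a and engine_b are hypothetical stand-ins that return fixed results, and scoring each URL by the sum of 1/rank is an assumed merging rule chosen for illustration. The engines are also called one after another here rather than simultaneously.

```python
def engine_a(query):
    """Hypothetical individual engine: returns URLs, best first."""
    return ["http://example.com/1", "http://example.com/2", "http://example.com/3"]


def engine_b(query):
    """Hypothetical individual engine with its own database and ranking."""
    return ["http://example.com/2", "http://example.com/4"]


def meta_search(query, engines):
    """Query every engine and merge the results into one ranked list."""
    scores = {}
    for engine in engines:
        for rank, url in enumerate(engine(query), start=1):
            # A URL ranked highly by several engines collects a larger score.
            scores[url] = scores.get(url, 0.0) + 1.0 / rank
    # Highest score first; ties are broken alphabetically by URL.
    return sorted(scores, key=lambda url: (-scores[url], url))


merged = meta_search("goat cloning", [engine_a, engine_b])
print(merged)
```

The URL returned by both engines ends up first because its scores add up, and each URL appears only once in the merged list.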



Procedure for online searching
• Log on - The user connects the terminal and is identified as a legitimate user
• Search - Involves selecting a search engine and selecting keywords or desired terms
• Results - The user requests the results in a chosen format, as pre-formatted fields or paragraphs, either online or offline
• Log off - The searcher instructs the system to discontinue the session



Search Mechanism
• Boolean - Use 'AND', 'OR' and 'NOT' to connect words or phrases in the search statement
• Plus/Minus - Cloning+Goat
• Phrases - Enclosed in double quotes: “Gregor Mendel”
• Proximity operators - Specify how close the terms should be: Heat()shock()Proline
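Phrase and proximity searching both depend on word positions, not just on which words are present. The sketch below shows one simple way to check them over whitespace-separated words; the function names are hypothetical, punctuation is not handled, and reading an empty "()" as "immediately followed by" is an assumption about that operator syntax.

```python
def tokens(text):
    """Lower-case, whitespace-separated words."""
    return text.lower().split()


def phrase_match(text, phrase):
    """True if the words of `phrase` occur consecutively, in order, in `text`."""
    words, target = tokens(text), tokens(phrase)
    n = len(target)
    return any(words[i:i + n] == target for i in range(len(words) - n + 1))


def within(text, first, second, max_gap=0):
    """True if `second` follows `first` with at most `max_gap` words between."""
    words = tokens(text)
    first, second = first.lower(), second.lower()
    for i, word in enumerate(words):
        if word == first and second in words[i + 1:i + 2 + max_gap]:
            return True
    return False


print(phrase_match("The work of Gregor Mendel on peas", "Gregor Mendel"))  # True
print(within("heat shock proline", "heat", "shock"))                       # True
print(within("heat induced shock", "heat", "shock"))                       # False
print(within("heat induced shock", "heat", "shock", max_gap=1))            # True
```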



Search Mechanism (Contd.)
• Nesting - Allows complex queries to be built: Structure AND (DNA OR RNA)
• Wild card or truncation operators - Retrieve all terms beginning with the same letters: Phyloge? retrieves phylogeny, phylogenics, phylogenic tree, etc.
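Nesting and truncation can be illustrated with set operations over a tiny collection: AND becomes set intersection, OR becomes set union, and the parentheses decide which is evaluated first. The documents and helper names below are invented for the example.

```python
# Hypothetical mini-collection: document ID -> text.
docs = {
    1: "structure of dna double helix",
    2: "rna structure prediction",
    3: "protein structure",
    4: "phylogeny of cattle breeds",
    5: "phylogenic tree construction",
}


def having(term):
    """IDs of documents that contain the exact word `term`."""
    return {doc_id for doc_id, text in docs.items() if term in text.split()}


def truncated(prefix):
    """IDs of documents with any word beginning with `prefix` (like Phyloge?)."""
    return {
        doc_id
        for doc_id, text in docs.items()
        if any(word.startswith(prefix) for word in text.split())
    }


# Structure AND (DNA OR RNA): the inner OR is evaluated first.
nested = having("structure") & (having("dna") | having("rna"))
print(nested)  # {1, 2}

# Phyloge? : every word starting with "phyloge".
wild = truncated("phyloge")
print(wild)    # {4, 5}
```

Without the parentheses, a query such as Structure AND DNA OR RNA could be read as (Structure AND DNA) OR RNA, which would also return documents about RNA that never mention structure.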



General Tips for Search Strategy
• Be specific
• Use + symbol to narrow down search
• Use phrase searching “”
• Truncate words with ? to pick up singular and plural versions
• Use synonyms via OR operator
• Combine key words into phrases wherever possible, e.g. “data” “population”
• Try alternate spelling
• Type your key word in lower case letters
• Be persistent and creative!


Offline Information Gateway
• Information which is already in the form of CD-ROMs, books, journals, etc.
Some of the offline information CD-ROMs
• ASFA - Aquatic Sciences and Fisheries Abstracts
• BEASTCD - Animal production database
• BIOSIS - Biological abstracts
• VetCD - Veterinary science database



Libraries on computer disk (Literature search through CD-ROM)

Computers give research workers and scientists a great facility: the literature available worldwide can be reviewed in a short time.
• E-journals or any other journal can be accessed
• CDs containing abstracts on various veterinary science topics, published in a number of very important journals, are now available in the market. By subscribing to those CDs and using literature search software, we can use the vast amount of information stored on them.



Computer Applications in Veterinary and Animal Husbandry Practices
• Educational and information purposes
• Epidemiology
• Clinicopathology
• Animal genetics and breeding



Various resources
• Computer-aided Learning in Veterinary Education (CLIVE) - Learn various aspects of veterinary science
• International Veterinary Information Centre (IVIC) - Network of computers in the field
• Vetstream - Species-based CD-ROMs
• VetWeb - WWW for veterinary users
• Episcope - For epidemiological principles and calculations



• Computer system for recording events affecting economically important livestock (COSREEL)
• ECG diagnostic system to analyze a patient automatically
• CARDIO diagnostic system for veterinarians
• HEMO clinicopathological system
• Veterinary Investigation Recording User System (VIRUS) - Analyzing individual cow records
• Breeding data analysis software
• Disease investigation software



• Animal husbandry data analysis
• Veterinary practice management software
• Computerized herd health information management
• Computer packages - LSMLMW, MATLAB, MICROSTAT for analysis of animal breeding and performance data



Software for statistical analysis of animal breeding data
• ASReml is a statistical software package for fitting linear mixed models using restricted maximum likelihood
• GenStat is a general statistical package
• GraphPad Prism is a commercial scientific 2D graphing and statistics software
• SPSS Statistics is a software package for statistical analysis
• SAS is a software suite developed for advanced analytics, business intelligence, data management and predictive analytics



Debugging
Debugging is the process of locating and fixing or bypassing bugs (errors) in computer program code or the engineering of a hardware device. To debug a program or hardware device is to start with a problem, isolate the source of the problem, and then fix it. Debugging tools (called debuggers) help identify coding errors at various development stages. Some programming language packages include a facility for checking the code for errors as it is being written.
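The three steps described here (start with a problem, isolate its source, fix it) can be walked through on a deliberately buggy function. The functions and the body-weight figures below are invented for illustration.

```python
def mean_buggy(values):
    # Bug: divides by a fixed number instead of the number of values.
    return sum(values) / 2


def mean_fixed(values):
    # Fix: divide by how many values there actually are.
    return sum(values) / len(values)


weights = [30, 32, 34]  # hypothetical body weights in kg

# Step 1, the problem: the reported mean is larger than every weight.
print(mean_buggy(weights))  # 48.0

# Step 2, isolate the source: check each part of the calculation separately.
print(sum(weights))         # 96, so the total is right
print(len(weights))         # 3, so the divisor should be 3, not 2

# Step 3, fix it and confirm the result.
print(mean_fixed(weights))  # 32.0
```

A debugger automates step 2 by letting you pause the program and inspect intermediate values instead of printing them.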



Creating Algorithm in Programming

• An algorithm is a set of well-defined instructions, in sequence, to solve a problem.
• An algorithm should always have a clear stopping point.



• Inputs and outputs should be defined precisely.
• Each step in the algorithm should be clear and unambiguous.
• The algorithm should be the most effective among the many different ways to solve a problem.
• An algorithm shouldn't contain computer code. Instead, it should be written in such a way that it can be used in different programming languages.
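As an illustration of these properties, consider Euclid's algorithm for the greatest common divisor. The slides give no example, so this choice is an assumption. The numbered steps in the comments are the algorithm itself, written without reference to any language; the Python around them is just one possible implementation.

```python
def gcd(a, b):
    """Greatest common divisor of two non-negative integers (not both zero)."""
    # Step 1 (inputs defined precisely): two non-negative integers a and b.
    # Step 2 (clear, unambiguous step): while b is not zero,
    #         replace the pair (a, b) with (b, remainder of a divided by b).
    while b != 0:
        a, b = b, a % b
    # Step 3 (clear stopping point): b is zero, so stop.
    #         Output: a, the greatest common divisor.
    return a


print(gcd(48, 18))  # 6
```

The loop must stop because the remainder is always smaller than b, so b shrinks on every pass until it reaches zero.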



For AGB 111 course refer
• http://issuu.com/animalgenetics
• http://www.agb111.blogspot.in/

Thank You
