WAVe, Semantic Web and The Meaning of Life


No promises on this one!!

PEDRO LOPES pedrolopes@ua.pt University of Leicester October 8th, 2010


PAST

PRESENT

FUTURE

SemanticWeb

Yes! It’s a “flux capacitor”!


DYNAMIC, WEB-BASED, WORKFLOW SERVICE COMPOSITION


http://bioinformatics.ua.pt/diseasecard

‣ FUTURE • Enhance User Experience • Improve Performance • Improve Structure & Organization • Add Semantic Layer

BRIDGING THE GAP BETWEEN GENOMICS AND MEDICINE


PAST

PRESENT

FUTURE

SemanticWeb



WHAT IS WAVe?


WEB ANALYSIS OF THE VARIOME

?

Enable agile access to integrated & enriched human variome research datasets

!

Genes * [LSDBs + Variants + Original Resources]


An extensible lightweight integration & enrichment platform for genomic variation datasets


http://bioinformatics.ua.pt/WAVe


HIGHLIGHT | RESOURCES

‣ LSDB • LOVD + MUTbase + UMD + misc legacy

‣ GENE • GeneCards + GeneNames + Entrez

‣ PHARMACOGENOMICS • PharmGKB

‣ LOCUS • MapViewer + Ensembl

‣ PUBLICATION • QuExT

‣ PATHWAY • KEGG + Reactome

‣ DISEASE • OMIM

‣ PROTEIN • UniProt + PDB + Expasy + InterPro

‣ GENE ONTOLOGY • AmiGO

~ 1350 Genes, 1550 LSDBs, 80k Variants, 100k Links!


HIGHLIGHT | FEATURES

‣ GENE SEARCH • Direct access to genes ‣ Auto-suggest engine

‣ GENE ANALYSIS WORKSPACE • Navigation tree ‣ Holistic perspective on all data • “Live view” mode ‣ Shows original applications/content

‣ API • RSS/XML access to data ‣ Usable in any framework • Genes ‣ Access navigation tree data ‣ Google Chrome Extension • Variants ‣ Only platform that aggregates variants from multiple sources
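As a rough illustration of why RSS/XML access is "usable in any framework", the sketch below reads an RSS 2.0 document with Python's standard library. The feed content, titles and example.org links are hypothetical placeholders; the layout of the real WAVe feed is not shown in these slides and may differ.

```python
import xml.etree.ElementTree as ET

# Hypothetical RSS 2.0 document standing in for an API response.
feed = """<rss version="2.0">
  <channel>
    <title>Example gene feed</title>
    <item>
      <title>Example variant 1</title>
      <link>http://example.org/variant/1</link>
    </item>
    <item>
      <title>Example variant 2</title>
      <link>http://example.org/variant/2</link>
    </item>
  </channel>
</rss>"""


def read_items(xml_text):
    """Return (title, link) for every <item> element in the feed."""
    root = ET.fromstring(xml_text)
    return [(item.findtext("title"), item.findtext("link"))
            for item in root.iter("item")]


for title, link in read_items(feed):
    print(title, link)
```

Any language with an XML parser could consume the same document, which is the point of exposing the data in a standard format.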


FUTURE PERSPECTIVES ‣ USER INTERACTION • Global search + augmented browsing + tabbed browsing + custom profiles

‣ RESOURCES • All genes • Café RouGE + LOVD-worldwide • dbSNP + HapMap + 1000 Genomes • LRG...

‣ SEMANTIC WEB • Meaningful relationships ‣ Gene to Protein ≠ Gene to Variant ≠ Gene to Pathway ≠ ...


PAST

PRESENT

FUTURE

SemanticWeb


HYPE


WHAT? ‣ Semantic Web is the Web of Knowledge ‣ It is about standards for publishing, sharing and querying knowledge drawn from distributed and heterogeneous resources ‣ It enables the answering of sophisticated questions

OK... BUT WHAT DO WE NEED TO DO?


FREE TEXT

The Eiffel Tower (French: La Tour Eiffel, [tuʁ ɛfɛl], nickname La dame de fer, the iron lady) is an 1889 iron lattice tower located on the Champ de Mars in Paris that has become both a global icon of France and one of the most recognizable structures in the world. The tallest building in Paris, it is the most-visited paid monument in the world; millions of people ascend it every year. Named for its designer, engineer Gustave Eiffel, the tower was built as the entrance arch to the 1889 World's Fair. The tower stands 324 metres (1,063 ft) tall, about the same height as an 81-storey building. It was the tallest man-made structure in the world from its completion until the Chrysler Building in New York City was built in 1930. Not including broadcast antennas, it is the second-tallest structure in France after the 2004 Millau Viaduct. The tower has three levels for visitors. Tickets can be purchased to ascend, by stairs or lift, to the first and second levels. The walk to the first level is over 300 steps, as is the walk from the first to the second level. The third and highest level is accessible only by elevator. Both the first and second levels feature restaurants.


STRUCTURED TEXT

Name: Eiffel Tower, La Tour Eiffel
Location: Paris, France
Architect: Stephen Sauvestre
Height: 324m
...


RELATIONAL MODEL

NAME          | LOCATION      | HEIGHT
Eiffel Tower  | Paris, France | 324m
...           | ...           | ...


SEMANTIC WEB

La Tour Eiffel sameAs Eiffel Tower
Eiffel Tower hasHeight 324 m
Eiffel Tower isLocatedAt Paris, France
Eiffel Tower hasArchitect Stephen Sauvestre


EXPRESSING MEANING ‣ TRIPLES • Everything (really everything!) can be described as a statement based on a triple (or combination of statements)

‣ EXAMPLES • Liverpool is a sports club • James Cameron directed Avatar • Protein P05067 is located in Membrane

‣ SUBJECT PREDICATE OBJECT • Building and connecting statements creates knowledge
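The subject-predicate-object idea can be sketched in a few lines of Python. This is only an illustration: the tuple layout and the `about` helper are assumptions made for this example, not part of any Semantic Web library.

```python
# Each statement is a (subject, predicate, object) triple.
# The first three statements come from the slide.
triples = [
    ("Liverpool", "is a", "sports club"),
    ("James Cameron", "directed", "Avatar"),
    ("Protein P05067", "is located in", "Membrane"),
]


def about(subject, statements):
    """Return every (predicate, object) pair stated about a subject."""
    return [(p, o) for s, p, o in statements if s == subject]


# Connecting statements: this extra triple (added for the example) reuses
# "Avatar", so the film is now linked to the director statement above.
triples.append(("Avatar", "is a", "film"))

for s, p, o in triples:
    print(f"{s} {p} {o}")
```

Because "Avatar" appears as an object in one statement and a subject in another, the two statements form a small connected graph rather than two isolated facts.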


ENABLING KNOWLEDGE

uniprot:P05067 is a Protein
uniprot:P05067 label Amyloid precursor protein
uniprot:P05067 involved omim:104300
omim:104300 is a Disease
omim:104300 label Alzheimer
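A minimal sketch of how connected statements turn into an answer: starting from a protein, follow its "involved" link and read the label of whatever it points at. The in-memory set, the helper names and the direction of the "involved" edge (protein to disease) are assumptions for illustration.

```python
# Hypothetical in-memory copy of the graph on this slide.
# The direction of "involved" (protein -> disease) is an assumption.
graph = {
    ("uniprot:P05067", "is a", "Protein"),
    ("uniprot:P05067", "label", "Amyloid precursor protein"),
    ("uniprot:P05067", "involved", "omim:104300"),
    ("omim:104300", "is a", "Disease"),
    ("omim:104300", "label", "Alzheimer"),
}


def objects(subject, predicate):
    """All objects o such that (subject, predicate, o) is in the graph."""
    return sorted(o for s, p, o in graph if s == subject and p == predicate)


def disease_labels(protein):
    """Follow 'involved' edges, then read the label of each disease."""
    labels = []
    for disease in objects(protein, "involved"):
        labels.extend(objects(disease, "label"))
    return labels


print(disease_labels("uniprot:P05067"))
```

Neither the protein record nor the disease record alone answers "which disease is this protein involved in, by name"; the answer only appears once the statements are linked.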



OWL ‣ DEFINE RELATIONS ‣ Web Ontology Language • Define complex concept environments • Individual + Property assertion = Axiom • “Object-Oriented” ‣ Classes ‣ Properties ‣ Instances

FOAF Friend-Of-A-Friend


RDF

‣ STATEMENT STORAGE ‣ Resource Description Framework • Store data as triples ‣ File formats • RDF/XML • N3 • Turtle

‣ Relational database • Quite heavy and not easy to deal with ‣ Text files must be read (and parsed) (and cached)
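To make "store data as triples" concrete, the sketch below writes triples as N-Triples style lines (one statement per line, which is also valid Turtle). The `rdf:type` and `rdfs:label` IRIs are the standard W3C ones; the `http://example.org/` namespace and the helper names are hypothetical, and the escaping covers only backslashes and double quotes.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"
EX = "http://example.org/"  # hypothetical namespace for this sketch


def term(value):
    """Format an IRI as <...> and anything else as a quoted string literal."""
    if value.startswith("http://") or value.startswith("https://"):
        return f"<{value}>"
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def to_ntriples(triples):
    """Serialise (subject, predicate, object) tuples, one statement per line."""
    return "\n".join(f"{term(s)} {term(p)} {term(o)} ." for s, p, o in triples)


triples = [
    (EX + "P05067", RDF_TYPE, EX + "Protein"),
    (EX + "P05067", RDFS_LABEL, "Amyloid precursor protein"),
]

print(to_ntriples(triples))
```

A file in this form has to be read and parsed before it can be queried, which is the cost the slide notes for text-file storage.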


SPARQL ‣ ASK QUESTIONS ‣ SPARQL Protocol and RDF Query Language • Query data stored in RDF • SQL’s “younger brother” • Features ‣ Ambiguous ‣ Multiple variables
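The "multiple variables" idea can be illustrated without a SPARQL engine. The sketch below is a toy pattern matcher in plain Python, not SPARQL itself: terms starting with `?` are variables, and a query is a list of triple patterns that must all hold. The data, predicate names and `ex:` prefix are assumptions; the comment shows roughly what the same question would look like in SPARQL.

```python
# Roughly equivalent SPARQL (assuming a hypothetical ex: vocabulary):
#   PREFIX ex: <http://example.org/>
#   SELECT ?protein ?name
#   WHERE { ?protein ex:involved ?disease . ?disease ex:label ?name . }

graph = [
    ("uniprot:P05067", "is a", "Protein"),
    ("uniprot:P05067", "label", "Amyloid precursor protein"),
    ("uniprot:P05067", "involved", "omim:104300"),
    ("omim:104300", "is a", "Disease"),
    ("omim:104300", "label", "Alzheimer"),
]


def match(patterns, triples, binding=None):
    """Yield variable bindings that satisfy every triple pattern."""
    binding = binding or {}
    if not patterns:
        yield dict(binding)
        return
    first, rest = patterns[0], patterns[1:]
    for triple in triples:
        new = dict(binding)
        ok = True
        for pat, val in zip(first, triple):
            if pat.startswith("?"):
                # Bind the variable, or check it agrees with an earlier binding.
                if new.setdefault(pat, val) != val:
                    ok = False
                    break
            elif pat != val:
                ok = False
                break
        if ok:
            yield from match(rest, triples, new)


query = [("?protein", "involved", "?disease"), ("?disease", "label", "?name")]
results = list(match(query, graph))
print(results)
```

The shared variable `?disease` is what joins the two patterns, much as a join column would in SQL.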


SEMANTIC WEB RICHNESS

‣ CLIENT SIDE • User Interfaces ‣ Semantically rich applications ‣ Meaningful results ‣ Context ‣ Enrich text • Information Visualization • Augmented browsing

‣ SERVER SIDE • Ontology • Semantically rich resources • Meaningful relationships • Reasoning • Context-aware • Artificial Intelligence • Linked Data • Intelligent resource networks

From server side semantic richness to client side interfaces

?



CLIENT SIDE Cardiac ... ECG

From simple result listings to semantically rich interfaces

!


DEMO SEMANTICALLY RICH INTERFACE


SERVER SIDE

[Word cloud: Composition, Tim Berners-Lee, DBPedia, RDF, OWL, Federation, Endpoint, SPARQL, Query, Knowledge, FOAF, SADI, Integration, Identity, Triplestore, Ontology, Mashup, Linked Data, Mapping, XML, D2R, Text, Network, Bio2RDF]


FEDERATED QUERYING

‣ ONE QUERY, MULTIPLE INSTANCES • Connect distinct resources ‣ Cross information ‣ Merge datasets

‣ CHALLENGES • How to query so many distinct resources? • How to map results?

‣ SOLUTIONS • SPARQL querying • Ontology mapping

[Diagram: instances 1, 2, 3, ..., n]
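The two solutions listed for federated querying, one query sent to every instance plus a mapping between vocabularies, can be sketched as follows. Everything here is hypothetical: the instances are in-memory lists rather than remote endpoints, and the identifiers and predicate names are made up for the example.

```python
# Two hypothetical instances that describe variants with different vocabularies.
instance_a = [("variant:1", "inGene", "gene:APP")]
instance_b = [("variant:2", "locatedInGene", "gene:APP")]

# Ontology mapping (assumed): local predicate name -> shared predicate name.
mapping = {"locatedInGene": "inGene"}


def query(instance, predicate, obj):
    """Subjects with (subject, predicate, obj), after mapping local predicates."""
    return [s for s, p, o in instance
            if mapping.get(p, p) == predicate and o == obj]


def federated_query(instances, predicate, obj):
    """Send one query to every instance and merge answers without duplicates."""
    merged = []
    for instance in instances:
        for subject in query(instance, predicate, obj):
            if subject not in merged:
                merged.append(subject)
    return merged


print(federated_query([instance_a, instance_b], "inGene", "gene:APP"))
```

Without the mapping step the second instance would return nothing, since it uses a different name for the same relationship.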


FEDERATED QUERYING IN GEN2PHEN

‣ MULTIPLE LSDBs • Get data from distinct LOVD instances

[Diagram: CHINA, AUSTRALIA, FRANCE, UK, ...]

‣ MULTIPLE MOLGENIS • Connect data models distributed in multiple MOLGENIS instances

[Diagram: PHENO, VARIO, PAGE, HGVbaseG2P, ...]


ADVANTAGES ‣ DATA ACCESS • Direct ‣ No need for wrappers or mediators ‣ No need for data mappings or transformations • Homogeneous ‣ Results are retrieved as XML/JSON • Coherent • Easy to parse/browse • Client-side ready

SWAT4LS?

‣ DATA MODELS • Semantic, not relational ‣ Ontology ‣ No need for direct connections • INNER JOIN

• Reasoning ‣ Ask questions ‣ Process answers
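To illustrate "results are retrieved as XML/JSON" and "easy to parse", the sketch below flattens a response in the W3C SPARQL Query Results JSON Format (a `head.vars` list plus `results.bindings`). The response body, variable names and example.org IRI are invented for the example.

```python
import json

# Hypothetical response body in the SPARQL Query Results JSON Format.
response_text = """
{
  "head": {"vars": ["gene", "variant"]},
  "results": {"bindings": [
    {"gene": {"type": "literal", "value": "APP"},
     "variant": {"type": "uri", "value": "http://example.org/variant/1"}}
  ]}
}
"""


def rows(text):
    """Turn each binding into a plain {variable: value} dictionary."""
    data = json.loads(text)
    variables = data["head"]["vars"]
    return [
        {v: b[v]["value"] for v in variables if v in b}
        for b in data["results"]["bindings"]
    ]


print(rows(response_text))
```

Because every endpoint answers in the same shape, the same few lines work for any of them, on the server or in a browser client.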


DEMO FEDERATED QUERIES


THE MEANING OF LIFE

‣ COEUS • One of the 12 Titans ‣ Greek deities • Titan of Wisdom, Intelligence, Knowledge

THE CORE OF MY PHD WORK


SEMANTIC CONCEPT MANAGEMENT FRAMEWORK

[Diagram: Oral ..., DC4, Marker, WAVe, COEUS]


http://bioinformatics.ua.pt


ME pedrolopes@ua.pt (or pl97@le.ac.uk)

http://pedrolopes.net

/pedrolopes

/pdrlps


QUESTIONS? THANK YOU!

