Service Composition for Biomedical Applications

SERVICE COMPOSITION FOR BIOMEDICAL APPLICATIONS
Pedro Lopes
pedrolopes@ua.pt

supervisor
José Luís Guimarães Oliveira, universidade de aveiro

doctoral programme in informatics engineering
october 1st, 2012

jury
Artur Manuel Soares da Silva, universidade de aveiro
Víctor Maojo García, universidade politécnica de madrid
Rui Pedro Sanches de Castro Lopes, escola superior de tecnologia e gestão do instituto politécnico de bragança
Francisco José Moreira Couto, universidade de lisboa
Carlos Manuel Azevedo Costa, universidade de aveiro


software engineering

SERVICE COMPOSITION FOR BIOMEDICAL APPLICATIONS bioinformatics & computational biology


DECREASED DISEASE RISK


ELEVATED DISEASE RISK


DATA EVOLUTION

[chart: NAR database list evolution, from 2004 to 2012; series: Total and New]


DATA EVOLUTION

more data

bioinformatics requirements

more tools

computer science developments

innovation


DATA EVOLUTION

innovation

new software and hardware

more data

bioinformatics requirements

more tools


DATA EVOLUTION

more tools innovation more data


DATA EVOLUTION

more data more tools innovation


NEW STRATEGIES TO IMPROVE AND ADD VALUE TO SERVICE COMPOSITION SCENARIOS

more data more tools innovation


NEW STRATEGIES TO IMPROVE AND ADD VALUE TO SERVICE COMPOSITION SCENARIOS

workflow-based strategies for service composition resource integration approaches for service composition enhancing service composition with the semantic web and rad


SCIENTIFIC WORKFLOWS


data in integration perspective distributed composition control

data out result analysis knowledge exchange


INTEROPERABILITY CHALLENGE scientific workflows workflow & service interoperability parallelization, security, integration

activity execution combine and evaluate multiple independent activities deliver service & workflow execution in real-time web-based environment

data sharing knowledge exchange semantics collaborative and reproducible research

previous work dynamicflow taverna is the de facto workbench for scientific workflows

deploy a scalable architecture to enable service composition between multiple data & services providers

?


WORKFLOW COMPOSITION ARCHITECTURE

5 CLIENT APPLICATIONS
4 APPLICATION ENGINE
3 DATA MANAGEMENT
2 WORKFLOW ENGINE: Workflow Execution Engine, Taverna
1 WORKFLOWS: WEB SERVICE, WEB SERVICE, WEB SERVICE

other components shown: API, Hibernate, Data Combination Engine


WORKFLOW COMPOSITION ARCHITECTURE

5 CLIENT APPLICATIONS
4 APPLICATION ENGINE
3 DATA MANAGEMENT

2 WORKFLOW ENGINE: Workflow Execution Engine, Taverna
workflow engine: new java-based taverna workflow wrapper; enable distributed service orchestration; open knowledge provider hub

1 WORKFLOWS: WEB SERVICE, WEB SERVICE, WEB SERVICE
interaction xsd: xml schema for service composition; normalize service input and output; enable autonomous data exchanges

other components shown: API, Hibernate, Data Combination Engine


HIGHLIGHTS workflow execution engine wrap taverna execution online real-time activity processing

new interoperability standard communication language to enable automated data exchanges from multiple providers ease the creation of distributed service composition workflows

distributed scientific workflows complex interactions between heterogeneous services distributed signal substantiation tasks

web-based workspace always-available collaborative environment custom on-demand data analysis

a new strategy for advanced service composition that enables collaborative & distributed research

!


RESULTS

EU-ADR WEB PLATFORM http://bioinformatics.ua.pt/euadr


a bioinformatics hub for pharmacovigilance knowledge providers
Pedro Lopes, David Campos, Tiago Nunes, José Luís Oliveira
acm transactions on management information systems [ongoing], november 2012

the eu-adr web platform: delivering advanced pharmacovigilance tools
José Luís Oliveira, Pedro Lopes, Tiago Nunes, David Campos, S Boyer, E Ahlberg, EM Van Mulligen, JA Kors, B Singh, LI Furlong, F Sanz, A Bauer-Mehren, MC Carrascosa, J Mestres, P Avillach, G Diallo, C Diaz, J Van der Lei
pharmacoepidemiology and drug safety [second revision], october 2012

automatic filtering and substantiation of drug safety signals
A Bauer-Mehren, EM van Mulligen, P Avillach, MC Carrascosa, B Singh, R Garcia-Serna, Pedro Lopes, José Luís Oliveira, G Diallo, J Mestres, E Ahlberg Helgee, S Boyer, F Sanz, JA Kors, LI Furlong
plos computational biology, march 2012


NEW STRATEGIES TO IMPROVE AND ADD VALUE TO SERVICE COMPOSITION SCENARIOS

workflow-based strategies for service composition resource integration approaches for service composition enhancing service composition with the semantic web and rad


INTEGRATION CHALLENGE

UNDERSTAND CHANGES IN OUR GENETIC SEQUENCE genetic dataset aggregation extract data from distributed lsdbs

genotype-to-phenotype integration enrich human variome data with connections to multiple data types

content accreditation promote correct authorship, ownership and attribution

previous work scientific workflows ideal for service-service interactions service composition for resource integration

design a service composition strategy to enable agile integration of enriched human variome knowledge

?


INTEGRATION STRATEGY

deliver knowledge web application variome api

API

APPLICATION ENGINE GENE LIST

LOVD

UMD

IDB

LSDB LIST

Other

external data miscellaneous heterogeneous sources from gene ontology to protein databases

FEED READER

distributed lsdbs custom variant readers and web crawling multiple non-standardized formats

ARABELLA


INTEGRATION ARCHITECTURE

5 CLIENT APPLICATIONS: API
4 APPLICATION ENGINE
3 BUILD ENGINE
2 INTEGRATION MIDDLEWARE: CSV, XML, SQL, REST
1 CONFIGURATION


INTEGRATION ARCHITECTURE

5 CLIENT APPLICATIONS: API
api: rest api for external use; enables service composition scenarios

4 APPLICATION ENGINE
extensible data model: core (gene + variant) plus extensions; lightweight link-based connections

3 BUILD ENGINE
advanced integration engine: service composition for intelligent lsdb data extraction; variation dataset enrichment

2 INTEGRATION MIDDLEWARE: CSV, XML, SQL, REST
data gathering wrappers: configurable resources; load data from csv, xml, sql and rest sources

1 CONFIGURATION
flexible configuration: single resource setup file; innovative service composition description schema
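
The integration middleware above loads data from csv, xml, sql and rest sources through configurable wrappers. A minimal Java sketch of that dispatch idea follows. The names `WrapperRegistry`, `SourceType` and `Wrapper` are hypothetical and chosen for illustration only; they are not taken from the WAVE code base, and real wrappers would read actual resources instead of returning rows supplied by the caller.

```java
import java.util.EnumMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: one data gathering wrapper per configured source type.
final class WrapperRegistry {
    // The four source types named on the slide.
    enum SourceType { CSV, XML, SQL, REST }

    // A wrapper turns a resource location into rows of string fields.
    interface Wrapper {
        List<String[]> load(String location);
    }

    private final Map<SourceType, Wrapper> wrappers = new EnumMap<>(SourceType.class);

    void register(SourceType type, Wrapper wrapper) {
        wrappers.put(type, wrapper);
    }

    // Delegates loading to the wrapper registered for the given type.
    List<String[]> load(SourceType type, String location) {
        Wrapper wrapper = wrappers.get(type);
        if (wrapper == null) {
            throw new IllegalArgumentException("no wrapper configured for " + type);
        }
        return wrapper.load(location);
    }

    // Parses a type name as it might appear in a setup file, ignoring case and padding.
    static SourceType parseType(String name) {
        return SourceType.valueOf(name.trim().toUpperCase(java.util.Locale.ROOT));
    }
}
```

Keeping the dispatch in a registry means a new source type only needs one extra wrapper registration, which is in the spirit of the "configurable resources" point on the slide.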


HIGHLIGHTS variation integration service composition approach to gather genetics dataset from multiple external resources

extensible model simplified description and addition of new external service composition actors

interoperability unique service composition variome api access to curated collection of genetic variants

innovative ui gene mesh and liveview content accreditation

new service composition methods allow understanding, exploring and connecting human variome knowledge

!


RESULTS

WAVE: WEB ANALYSIS OF THE VARIOME http://bioinformatics.ua.pt/wave


wave: web analysis of the variome
Pedro Lopes, Raymond Dalgleish and José Luís Oliveira
human mutation, march 2011

an extensible platform for variome data integration
Pedro Lopes and José Luís Oliveira
10th ieee international conference on information technology and applications in biomedicine, corfu, greece, november 2010

a holistic approach for integrating genomic variation information
Pedro Lopes and José Luís Oliveira
10th spanish symposium on bioinformatics, malaga, spain, october 2010


NEW STRATEGIES TO IMPROVE AND ADD VALUE TO SERVICE COMPOSITION SCENARIOS

workflow-based strategies for service composition resource integration approaches for service composition enhancing service composition with the semantic web and rad


NEXT-GENERATION BIOMEDICAL APPLICATIONS

life sciences complex mesh of data and services modern demands keep pushing computer science forward

previous work new service composition strategies for interoperability using scientific workflows new service composition strategies for resource integration

rapid application development easily-configurable service composition implementation environment build the next generation of service composition applications faster

the semantic web paradigm the perfect solution for the natural complexity of the life sciences


NEXT-GENERATION BIOMEDICAL APPLICATIONS enhanced rapid application development straightforward service composition setup process

advanced data integration engine rich service composition description, flexible integration from heterogeneous resources

semantic knowledge management involve semantic web technologies at all service composition layers

state-of-the-art interoperability apis enable service composition for everything and everyone

knowledge federation enable distributed access to knowledge

empower ecosystems delivery network for custom cross-platform and cross-device applications

design a new semantic web framework to streamline the creation of next generation biomedical applications

?


FRAMEWORK MODEL SINGLE INSTANCE

integration (data in): DATA INTEGRATION CONNECTORS for CSV, XML, SQL, SPARQL
KNOWLEDGE BASE: foaf:name, dc:title, rdfs:label, owl:imports
interoperability (data out): API with REST, JAVA, LDATA, SPARQL


FRAMEWORK MODEL FEDERATION

[diagram: three instances, each with DATA INTEGRATION CONNECTORS (CSV, XML, SQL, SPARQL), a KNOWLEDGE BASE (foaf:name, dc:title, rdfs:label, owl:imports) and an API (REST, JAVA, LDATA, SPARQL), joined by a KNOWLEDGE FEDERATION LAYER with its own API (REST, JAVA)]


FRAMEWORK ARCHITECTURE

6 CLIENT APPLICATIONS
5 API: REST, Java, pubby, Joseki, LinkedData, SPARQL
4 APPLICATION ENGINE
3 KNOWLEDGE BASE
2 INTEGRATION ENGINE: ABSTRACTION ENGINE
1 EXTERNAL SOURCES: CSV, XML, SQL, SPARQL


FRAMEWORK ARCHITECTURE

6 CLIENT APPLICATIONS
modern client-side development: ready for any ui framework

5 API: REST, Java, pubby, Joseki, LinkedData, SPARQL
future-proof interoperability: rest services; sparql endpoint + linkeddata interfaces available by default; java + javascript libraries

4 APPLICATION ENGINE
streamlined application engine: “semantic web in a box”; straightforward backend deployment with tomcat

3 KNOWLEDGE BASE
semantic knowledge management: mysql-based triplestore; jena-supported methods

2 INTEGRATION ENGINE: ABSTRACTION ENGINE
advanced integration engine: semantic web translation; new extract-transform-load strategy

1 EXTERNAL SOURCES: CSV, XML, SQL, SPARQL
flexible service composition configuration: comprehensive connectors & selectors; load data from csv, xml, sql and sparql sources


HIGHLIGHTS rapid application development quickly build a new service composition application ecosystem

semantic data integration platform flexible acquisition and translation of data from heterogeneous resources

semantic web and linkeddata interoperability future-proof interoperability with the most innovative application paradigm semantic reasoning and inference

federation rest, sparql and linkeddata apis one query, multiple knowledge bases

a framework to empower the creation of next-generation service composition-based semantic software

!
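
The federation highlight above ("one query, multiple knowledge bases") can be illustrated with the SPARQL 1.1 `SERVICE` keyword, which sends each graph pattern to a named endpoint. The Java sketch below only assembles such a query string; the class name `FederatedQuerySketch`, the `build` method and the `example.org` endpoints in the usage note are hypothetical, and nothing here is taken from the COEUS source code.

```java
import java.util.Map;

// Hypothetical sketch: compose one SPARQL query spanning several endpoints.
final class FederatedQuerySketch {
    // Wraps each graph pattern in a SERVICE block addressed to its endpoint.
    // The map's iteration order decides the order of the SERVICE blocks.
    static String build(String selectVars, Map<String, String> patternsByEndpoint) {
        StringBuilder query = new StringBuilder("SELECT " + selectVars + " WHERE {\n");
        for (Map.Entry<String, String> entry : patternsByEndpoint.entrySet()) {
            query.append("  SERVICE <").append(entry.getKey()).append("> { ")
                 .append(entry.getValue()).append(" }\n");
        }
        return query.append("}").toString();
    }
}
```

Passing a `LinkedHashMap` with, say, `http://example.org/a/sparql` and `http://example.org/b/sparql` as keys yields a single query with two `SERVICE` blocks in insertion order; variables shared between the patterns are what join the results across knowledge bases.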


RESULTS COEUS

http://bioinformatics.ua.pt/coeus


coeus: “semantic web in a box” for biomedical applications
Pedro Lopes and José Luís Oliveira
journal of biomedical semantics [second revision], october 2012

coeus: a semantic web application framework
Pedro Lopes and José Luís Oliveira
4th international semantic web applications & tools for life sciences workshop, london, united kingdom, december 2011

a semantic web application framework for health systems interoperability
Pedro Lopes and José Luís Oliveira
international workshop on managing interoperability and complexity in health systems, glasgow, scotland, november 2011

towards knowledge federation in biomedical applications
Pedro Lopes and José Luís Oliveira
7th international conference on semantic systems, graz, austria, october 2011


NEW STRATEGIES TO IMPROVE AND ADD VALUE TO SERVICE COMPOSITION SCENARIOS

workflow-based strategies for service composition resource integration approaches for service composition enhancing service composition with the semantic web and rad coeus evaluation


LEGACY DISEASECARD


rare diseases research portal collection of pointers to disease pages used for academia and clinical research

primitive engineering static integration/navigation protocol constrained data model

evolution imperative to update scientific dataset new features for users and developers

evaluation perfect benchmark for a new coeus instance


CREATING A NEW COEUS INSTANCE

setup new application model: re-use existing ontologies; improve omim basic model
configure service composition: define resource descriptions for connectors; specify data selectors
build triplestore: autonomous process; pull data from configured resources into semantic knowledge base
deliver knowledge: use apis to create advanced user interfaces; use apis to access data

# locus entity

:entity_Locus
    :isEntityOf :concept_Ensembl, :concept_EntrezGene, :concept_GeneCards, :concept_HGNC, :concept_MapView, :concept_UCSC;
    :isIncludedIn :seed_Diseasecard4;
    dc:description "Collects Locus entity knowledge."^^xsd:s
    dc:title "Locus"^^xsd:string;
    a :Entity, owl:NamedIndividual;
    rdfs:label "entity_Locus"^^xsd:string;

# hgnc concept

:concept_HGNC
    :hasEntity :entity_Locus;
    :hasResource :resource_HGNC;
    :isExtendedBy :resource_ClinicalTrials, :resource_ENZYME, :resource_Ensembl, :resource_HGNC, :resource_KEGG, :resource_MedlinePlus, :resource_UniProt;
    dc:description "Concept relating HGNC data."^^xsd:string
    dc:title "HGNC"^^xsd:string;
    a :Concept, owl:NamedIndividual;
    rdfs:label "concept_hgnc"^^xsd:string;


CREATING A NEW COEUS INSTANCE

setup new application model: re-use existing ontologies; improve omim basic model
configure service composition: define resource descriptions for connectors; specify data selectors
build triplestore: autonomous process; pull data from configured resources into semantic knowledge base
deliver knowledge: use apis to create advanced user interfaces; use apis to access data

# hgnc connector

:resource_HGNC
    :endpoint "http://www.genenames.org/cgi-bin/hgnc_downlo +data&col=gd_hgnc_id&col=gd_app_sym&col=gd_app_name&col=gd_pub_chrom_m %27#replace# %27&order_by=gd_app_sym_sort&limit=&format=text&submit=submit&.cgifiel dbtag"^^xsd:string;
    :extends :concept_HGNC;
    :extension "rdfs:label"^^xsd:string;
    :hasKey :csv_HGNC_id;
    :isResourceOf :concept_HGNC;
    :loadsFrom :csv_HGNC_id, :csv_HGNC_name;
    :method "complete"^^xsd:string;
    :order 3 ;
    dc:description "Resource connecting gene HGNC informati
    dc:publisher "csv"^^xsd:string;
    dc:title "HGNC"^^xsd:string;
    a :Resource, owl:NamedIndividual;
    rdfs:label "resource_hgnc"^^xsd:string;

# hgnc identifier selector

:csv_HGNC_id
    :isKeyOf :resource_HGNC;
    :loadsFor :resource_HGNC;
    :property "dc:source|dc:identifier"^^xsd:string;
    :query "0"^^xsd:string;
    dc:description "Information for HGNC CSV resource loading
    dc:title "HGNC_id"^^xsd:string;
    a :CSV, owl:NamedIndividual;
    rdfs:label "csv_hgnc_id"^^xsd:string;

# hgnc name selector

:csv_HGNC_name
    :loadsFor :resource_HGNC;
    :property "rdfs:comment|dc:description"^^xsd:string;
    :query "2"^^xsd:string;
    dc:description "Information for HGNC CSV resource loadi
    dc:title "HGNC_name"^^xsd:string;
    a :CSV, owl:NamedIndividual;
    rdfs:label "csv_hgnc_name"^^xsd:string;
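
The two CSV selectors above each carry a `:query` value ("0", "2") and a pipe-separated `:property` list. One plausible reading, which is an assumption and not confirmed by the slide, is that `:query` is a zero-based column index and that every listed predicate receives the value of that column. The Java sketch below applies that reading to one row; `CsvSelectorSketch`, `Selector` and `toTriples` are hypothetical names, and the row values in the usage note are illustrative.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of applying CSV selectors to a single row.
final class CsvSelectorSketch {
    // Mirrors the two selector fields shown on the slide. Reading ":query" as a
    // column index and ":property" as a predicate list is an assumption.
    static final class Selector {
        final int column;
        final String[] predicates;

        Selector(String query, String property) {
            this.column = Integer.parseInt(query);
            this.predicates = property.split("\\|");
        }
    }

    // Emits one plain-text statement per predicate of each selector.
    // Values are not escaped, so this is only suitable for illustration.
    static List<String> toTriples(String subject, String[] row, List<Selector> selectors) {
        List<String> triples = new ArrayList<>();
        for (Selector selector : selectors) {
            if (selector.column >= row.length) {
                continue; // skip selectors that point past the end of a short row
            }
            for (String predicate : selector.predicates) {
                triples.add(subject + " " + predicate + " \"" + row[selector.column] + "\"");
            }
        }
        return triples;
    }
}
```

With selectors built from ("0", "dc:source|dc:identifier") and ("2", "rdfs:comment|dc:description") and an illustrative row of identifier, symbol and name, the sketch yields four statements: two for column 0 and two for column 2.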


CREATING A NEW COEUS INSTANCE

setup new application model: re-use existing ontologies; improve omim basic model
configure service composition: define resource descriptions for connectors; specify data selectors
build triplestore: autonomous process; pull data from configured resources into semantic knowledge base
deliver knowledge: use apis to create advanced user interfaces; use apis to access data

[diagram: resources arranged in levels 0 to 3; entity types shown: DISEASE, GENE, ONTOLOGY, PROTEIN, DRUG, LITERATU; resources shown: OMIM, HGNC, UniProt, Entrez, HPO, InterPro, DrugBank, MeSH, PDB, Prosite, Ensembl, UMLS, PharmGKB, Pubmed]


CREATING A NEW COEUS INSTANCE

setup new application model: re-use existing ontologies; improve omim basic model
configure service composition: define resource descriptions for connectors; specify data selectors
build triplestore: autonomous process; pull data from configured resources into semantic knowledge base
deliver knowledge: use apis to create advanced user interfaces; use apis to access data

# java

pt.ua.bioinformatics.API.getTriple("coeus:hgnc_BRCA2", "p", "o", "xml")

# rest

http://bioinformatics.ua.pt/coeus/api/triple/coeus:hgnc_BRCA2/p/o/csv

# sparql federation

PREFIX dc: <http://purl.org/dc/elements/1.1/>
PREFIX diseasome: <http://www4.wiwiss.fu-berlin.de/diseasome/resource/d
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX coeus: <http://bioinformatics.ua.pt/coeus/>
SELECT ?pdb ?mesh
WHERE {
  {
    SERVICE <http://www4.wiwiss.fu-berlin.de/diseasome/sparql> {
      <http://www4.wiwiss.fu-berlin.de/diseasome/resource/gen
    }
  }
  {
    SERVICE <http://bioinformatics.ua.pt/coeus/sparql> {
      _:gene dc:title ?label .
      _:gene coeus:isAssociatedTo ?uniprot
    }
  }
  {
    SERVICE <http://bioinformatics.ua.pt/coeus/sparql> {
      ?uniprot coeus:isAssociatedTo ?pdb .
      ?pdb coeus:hasConcept coeus:concept_PDB
    }
  }
  {
    SERVICE <http://bioinformatics.ua.pt/coeus/sparql> {
      ?uniprot coeus:isAssociatedTo ?mesh .
      ?mesh coeus:hasConcept coeus:concept_MeSH
    }
  }
}
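
The REST example on this slide follows the path shape `api/triple/` followed by subject, predicate, object and output format. The Java sketch below rebuilds that URL from its parts. The class `TripleUrlSketch` and the method `tripleUrl` are hypothetical helper names; only the path layout is taken from the slide, and no encoding or validation of the segments is attempted.

```java
// Hypothetical helper that assembles a triple-pattern URL like the one on the slide.
final class TripleUrlSketch {
    // Joins the segments in the order shown in the slide's REST example:
    // base / api / triple / subject / predicate / object / format.
    static String tripleUrl(String base, String subject, String predicate,
                            String object, String format) {
        // Drop a trailing slash so the join does not produce "//".
        String trimmed = base.endsWith("/") ? base.substring(0, base.length() - 1) : base;
        return String.join("/", trimmed, "api", "triple", subject, predicate, object, format);
    }

    public static void main(String[] args) {
        System.out.println(tripleUrl("http://bioinformatics.ua.pt/coeus",
                "coeus:hgnc_BRCA2", "p", "o", "csv"));
    }
}
```

Using "p" and "o" for the predicate and object, as the slide does, leaves those positions of the triple pattern unbound in the request.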


RESULTS

THE NEW DISEASECARD http://bioinformatics.ua.pt/dc4


easy setup simplified resource integration straightforward client-side application creation

efficient development rapid application development at its best reduced implementation effort compared to similar systems

improved availability available to researchers through web application available to developers through default apis


NEW STRATEGIES TO IMPROVE AND ADD VALUE TO SERVICE COMPOSITION SCENARIOS

workflow-based strategies for service composition resource integration approaches for service composition enhancing service composition with the semantic web and rad


NEW STRATEGIES TO IMPROVE AND ADD VALUE TO SERVICE COMPOSITION SCENARIOS

CONCLUSIONS

1

interoperability

2

integration

3

service composition

new strategies for workflow-based service composition advanced methods to deliver knowledge

innovative integrative approach to describe service composition flexible integration engine to compose heterogeneous resources

pioneering framework for enhanced semantic web-based service composition next-generation strategies for integration and interoperability


NEW STRATEGIES TO IMPROVE AND ADD VALUE TO SERVICE COMPOSITION SCENARIOS

future perspectives


FUTURE PERSPECTIVES beyond service composition software-as-a-service use is increasing streamlined and lightweight interactions are everywhere

linkeddata and the semantic web semantic web as the foundation for new software engineering strategies linkeddata is a growing knowledge network

worldwide knowledge networks more sophisticated knowledge expression technologies richer, meaningful data are more connected

modern software platforms growing relevance of efficient content delivery one knowledge base, multiple cross-platform & cross-device clients

business value research and enterprise are intertwined coeus use goes beyond science


THANK YOU

