PhD Proposal - Service Composition for Biomedical Applications

Service Composition in Biomedical Applications
Pedro Lopes
pedrolopes@ua.pt
PhD Thesis Proposal
Programa Doutoral em Engenharia Informática
December 17th, 2009

Research Supervisor: José Luís Oliveira
jlo@ua.pt


Outline
‣ Introduction
‣ Bioinformatics
‣ Objectives
‣ Problems & Requirements
‣ Technologies
‣ Strategies
‣ Workplan
‣ What’s Next?


Introduction

‣ The Internet (and computer science) is undergoing a (r)evolution!
  • New application paradigms
‣ Web access anywhere, anytime and for everyone
  • Static
  • Mobile
‣ The platform for everything
  • New opportunities
  • New challenges


Data

Information

Knowledge


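[Motivation] Bioinformatics

‣ Binary alphabet (1, 0): 8 bits = 1 byte
  • Example: 10110011
  • 2^8 = 256 combinations

‣ DNA alphabet (A, T, C, G): 8 bases
  • Example: ACCGTTAG
  • 4^8 = 65536 combinations

Wonderful Complexity!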
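
The two counts on this slide (256 and 65536 combinations) follow from raising the alphabet size to the sequence length. A minimal Python sketch of that arithmetic, where the helper name `sequence_space` is an illustrative assumption and not part of the original material:

```python
def sequence_space(alphabet: str, length: int) -> int:
    """Count the distinct sequences of `length` symbols drawn from `alphabet`."""
    return len(set(alphabet)) ** length


# Binary alphabet: 8 bits make one byte.
binary_combs = sequence_space("10", 8)

# DNA alphabet: 8 bases, such as ACCGTTAG.
dna_combs = sequence_space("ATCG", 8)

print(binary_combs, dna_combs)
```

For the same sequence length, moving from a 2-symbol to a 4-symbol alphabet squares the number of possible sequences.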


[Contextualization] Bioinformatics

‣ It all started in the Human Genome Project...
  • Immense amount of data
    ‣ New technologies to deal with the “Book of Life”
‣ New projects were born
  • More data!
    ‣ Need for improved, next-generation applications


Data

Information

Knowledge


[Landscape] Bioinformatics

‣ Databases
  • KEGG, UniProt, EBI, NCBI, LOVD, UMD... (150 in GeNS)
‣ Service protocols
  • DAS, BioMart, EMBOSS, Soaplab, WABI, BioMOBY
‣ Integration applications
  • DiseaseCard, GeneBrowser, GeNS, ...
  • Taverna, Bioclipse...
  • Biozon, Bioconductor, Entrez, Ensembl, ...
  • Bio2RDF, RDF Scape, ...
‣ Previous research
  • DynamicFlow


Objectives

‣ Dig deep in the life sciences research field
  • Understand the problems
  • Study state-of-the-art
‣ Propose solutions
  • Analyze the requirements
  • Develop framework
    ‣ Internal and external usage
‣ Publish


Promote research and development of novel, next-generation frameworks and strategies to enhance life sciences web applications and systems


Roadmap

Problems & Requirements
‣ Heterogeneity
‣ Integration
‣ Interoperability
‣ Description

Technologies
‣ Web-based access
‣ Web Services
‣ GRID
‣ Semantic Web

Strategies
‣ Static Apps
‣ Dynamic Apps
‣ Meta Apps


[Problems & Requirements] Heterogeneity

‣ Subject of many research projects
‣ Occurs at various levels
  • Physical: Web Server, FTP Server, File Server, Backup Tape
  • Logical: Relational Database, OO Database, Text File, Binary File
  • Format: HTML, CSV, XML, TXT, Excel
  • Model: Structure, Ontology, Semantics
  • Access: Local, Remote APIs, Web Services


[Problems & Requirements] Integration

‣ To deal with resource heterogeneity
‣ Various solutions, from Centralized (...) to Distributed (...)
  • Warehouse
  • Mediator
  • Link

[Figure: Warehouse, Mediator and Link approaches, with three App boxes]

HybridMediator framework!
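
One way to read the mediator approach named on this slide: applications send a query to a single mediator, which forwards it to each registered source through a wrapper and merges the answers, so no data is copied into a central warehouse. The Python sketch below is illustrative only. The `Mediator` class, the two wrapper functions and the sample records are hypothetical placeholders, not the framework proposed in this work.

```python
from typing import Callable, Dict, List

Record = Dict[str, str]
Wrapper = Callable[[str], List[Record]]


class Mediator:
    """Hypothetical mediator: one query interface over several sources."""

    def __init__(self) -> None:
        self._wrappers: Dict[str, Wrapper] = {}

    def register(self, source: str, wrapper: Wrapper) -> None:
        # A wrapper hides how a given source is accessed and formatted.
        self._wrappers[source] = wrapper

    def query(self, gene: str) -> List[Record]:
        # Forward the query to every source and tag each answer with its origin.
        results: List[Record] = []
        for source, wrapper in self._wrappers.items():
            for record in wrapper(gene):
                results.append({"source": source, **record})
        return results


# Placeholder sources with made-up records, standing in for remote databases.
def source_a(gene: str) -> List[Record]:
    return [{"gene": gene, "protein": "protein-1"}] if gene == "gene-1" else []


def source_b(gene: str) -> List[Record]:
    return [{"gene": gene, "pathway": "pathway-1"}] if gene == "gene-1" else []


mediator = Mediator()
mediator.register("source_a", source_a)
mediator.register("source_b", source_b)
merged = mediator.query("gene-1")
print(merged)
```

A warehouse would instead copy both sources into one local store ahead of time, and a link-based approach would only return pointers to the source pages.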


[Problems & Requirements] Interoperability

‣ Facilitate integration and communication between applications
‣ Levels (increasing capability for interoperation, from bottom to top)
  • Conceptual interoperability
  • Dynamic interoperability
  • Pragmatic interoperability
  • Semantic interoperability
  • Syntactic interoperability
  • Technical interoperability
  • No interoperability


[Problems & Requirements] Description

‣ Resource description is the key to integration and interoperability
  • Provide meaning to content
‣ Apply area-specific terminology
  • Ontology
  • An extra effort for resource publishers
  • Will be very important in the future Internet


[Technologies] Web services

‣ Applications need to communicate with each other through the web
‣ Most widely used technology for the development of distributed web applications
  • SOAP
  • REST
  • XMPP

[Figure: Service Broker (UDDI), Service Requester and Service Provider, linked by WSDL (twice) and SOAP]


[Technologies] GRID and Semantic Web

‣ GRID
  • Combination of software and hardware infrastructures
    ‣ Pervasive, Consistent, Low-cost
  • Various GRID types (Computing, Data, Knowledge)
‣ Semantic Web
  • Resource Description
    ‣ Complete framework
  • OWL + RDF + SPARQL, Microformats
  • Link available resources in a meaningful way for both Humans and Machines
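
RDF describes resources as subject-predicate-object triples, and SPARQL queries match patterns over those triples. The sketch below mimics that idea with plain Python tuples. It uses no RDF library, and every identifier in it (`ex:gene1`, `ex:encodes`, `ex:associatedWith` and so on) is a made-up placeholder rather than a term from a real ontology.

```python
from typing import List, Optional, Tuple

Triple = Tuple[str, str, str]

# Made-up statements linking a gene to a protein and a disease.
TRIPLES: List[Triple] = [
    ("ex:gene1", "rdf:type", "ex:Gene"),
    ("ex:gene1", "ex:encodes", "ex:protein1"),
    ("ex:gene1", "ex:associatedWith", "ex:disease1"),
    ("ex:protein1", "rdf:type", "ex:Protein"),
]


def match(
    triples: List[Triple],
    s: Optional[str] = None,
    p: Optional[str] = None,
    o: Optional[str] = None,
) -> List[Triple]:
    """Return the triples matching a pattern; None acts as a wildcard."""
    return [
        t
        for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


# Everything stated about one resource.
about_gene1 = match(TRIPLES, s="ex:gene1")

# All resources typed as genes.
genes = [s for s, _, _ in match(TRIPLES, p="rdf:type", o="ex:Gene")]

print(about_gene1)
print(genes)
```

Because each statement names its predicate explicitly, the same data can be rendered for a person and traversed by a program.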


[Strategies] Static or Dynamic Applications

‣ Static
  ‣ Solve all the problems... “by hand”!
    • Hard-coded integration, interoperability and description
  ‣ Not a very clever solution
    • Adequate for (very) small projects
‣ Dynamic
  ‣ Take advantage of novel concepts
    • Description + Composition
    • Intelligent mechanisms for input/output combinations
  ‣ Generic
    • Suitable for the majority of scenarios


[Strategies] Meta Applications

‣ Applications running applications
  • Like metadata is data about data
‣ Software-as-a-service
  • Service Oriented Architectures
‣ Mashups
  • Workflows
‣ Advanced usage

[Figure: example workflow]
Activity 1a   In: A       Out: B
Activity 1b   In: A       Out: X
Activity 2a   In: B & Z   Out: C
Activity 2b   In: X       Out: Y
Activity 3    In: Y       Out: Z
Activity 4    In: C       Out: D
Activity 5    In: D       Out: Final
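
The activities in this workflow figure can be ordered purely from their declared inputs and outputs: an activity may run once all of its inputs are available. The Python sketch below assumes that the only initial input is `A` and that each activity produces exactly one output. The function name `execution_order` is illustrative and is not taken from DynamicFlow or any other tool.

```python
from typing import Dict, Iterable, List, Set, Tuple

# Activities from the slide: name -> (required inputs, produced output).
ACTIVITIES: Dict[str, Tuple[Set[str], str]] = {
    "1a": ({"A"}, "B"),
    "1b": ({"A"}, "X"),
    "2a": ({"B", "Z"}, "C"),
    "2b": ({"X"}, "Y"),
    "3": ({"Y"}, "Z"),
    "4": ({"C"}, "D"),
    "5": ({"D"}, "Final"),
}


def execution_order(
    activities: Dict[str, Tuple[Set[str], str]], initial: Iterable[str]
) -> List[str]:
    """Order activities so that each runs only after its inputs exist."""
    available = set(initial)
    pending = dict(activities)
    order: List[str] = []
    while pending:
        # Activities whose inputs are all available can run in this round.
        ready = sorted(n for n, (ins, _) in pending.items() if ins <= available)
        if not ready:
            raise ValueError(f"unsatisfiable inputs for: {sorted(pending)}")
        for name in ready:
            order.append(name)
            available.add(pending.pop(name)[1])
    return order


order = execution_order(ACTIVITIES, {"A"})
print(order)
```

Activity 2a needs both B and Z, so it has to wait until the 1b, 2b, 3 branch has produced Z, even though B is available after the first round.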


[Calendar] Workplan

[Figure: three Gantt charts spanning Year 1 to Year 4, each year divided into quarters Q1 to Q4]

‣ Thesis
  • State of the Art, Domain Analysis, Proposal, Main corpus, Delivery
‣ Software
  • Preliminary Research, System Analysis, Modelling, Active Development, Deliveries
‣ Publications
  • High Impact Factor, Medium Impact Factor


[Publications] Workplan

‣ Medium impact factor
  • International Conferences & Workshops
‣ High impact factor
  • Science, BMC Bioinformatics, Hindawi, Oxford Journals
‣ Published work
  • Dynamic Service Integration using Web-based Workflows
    ‣ 10th International Conference on Information Integration and Web Applications and Services; Linz, Austria; November 2008
  • DynamicFlow: A Client-side Workflow Management System
    ‣ 3rd International Workshop on Practical Applications of Computational Biology; Salamanca, Spain; June 2009
  • Arabella: A Directed Web Crawler
    ‣ International Conference on Knowledge Discovery and Information Retrieval; Madeira, Portugal; October 2009
  • Link Integrator: A Link-based Data Integration Architecture
    ‣ International Conference on Knowledge Discovery and Information Retrieval; Madeira, Portugal; October 2009
  • Integration of Variome Data using a Link Discovery Strategy
    ‣ Iberian Bioinformatics Conference 2009; Lisbon, Portugal; November 2009


What’s Next?

Promote research and development of novel, next-generation frameworks and strategies to enhance life sciences web applications and systems

‣ Research and Development
  • Enabling knowledge
    ‣ Semantic Web as a technology to ease integration and interoperability
  • Well-defined competition
  • Ongoing “hands-on” work
  • Promote internal and external usage
‣ One framework, multiple projects
  • EU-ADR, GEN2PHEN, DiseaseCard, OralCard, VarCard
‣ Publish


Thank You!


Questions?

