State of Open Source in Healthcare Research Informatics Dan “The Dude” Housman Managing Director, Analytical Applications/Open Source Evangelist
April 4, 2011
Copyright © 2010 Recombinant Data Corp. All rights reserved.
Comparative Effectiveness Research - 1747
James Lind, 1747, HMS Salisbury, Bay of Biscay, 12 scurvy cases

Treatment                                      N   Result
Quart of cider daily                           2   Some improvement
25 drops of elixir of vitriol (sulfuric acid)  2   No effect
Six spoonfuls of vinegar                       2   No effect
Half pint of seawater                          2   No effect
Two oranges and one lemon                      2   1 fit for duty, 1 recovered (ran out of fruit after 6 days)
Spicy paste plus barley water drink            2   No effect
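Lind's design is the same compare-arms-and-count logic that any EHR-based comparative effectiveness query uses today. A minimal sketch of the tabulation (arm names and outcome strings are paraphrased from the slide; the coding is a simplification for illustration):

```python
# Toy recreation of Lind's trial outcomes as listed above
# (two sailors per arm; outcome coding simplified for illustration).
arms = {
    "quart of cider daily":       "some improvement",
    "elixir of vitriol":          "no effect",
    "six spoonfuls of vinegar":   "no effect",
    "half pint of seawater":      "no effect",
    "two oranges and one lemon":  "1 fit for duty, 1 recovered",
    "spicy paste + barley water": "no effect",
}

# Arms showing any effect at all; only cider and citrus stand out.
effective = [arm for arm, result in arms.items() if result != "no effect"]
print(effective)  # ['quart of cider daily', 'two oranges and one lemon']
```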
1847: Midwives vs. Medical Students (vs. street births)
Ignaz Semmelweis, Vienna Maternity Hospital, 1847
First clinic: medical students. Second clinic: midwives.
(Lower mortality is better)
Extra, Extra - Washing hands can save lives…
“The thing that kills women with [puerperal fever] …is you doctors that carry deadly microbes from sick women to healthy ones.” Semmelweis, 1849
Final Results: System protected status quo
1848: Carl Edvard Marius Levy: “with due respect for the cleanliness of the Viennese students, it seems improbable that enough infective matter or vapor could be secluded around the fingernails to kill a patient… his opinions are not clear enough and his findings not exact enough to qualify as scientifically founded.”
1849: Employment terminated in Vienna.
1850-1860: Handwashing introduced briefly at other sites.
1861: Publishes The Etiology, Concept and Prophylaxis of Childbed Fever.
1862: Open letter to critics: “irresponsible murderers and ignoramuses.”
1865: Dies in a mental institution at age 47. Treated with castor oil, cold water, and a straitjacket; beaten after escape attempts. Cause of death: pyemia from the beatings.
Reversion and redemption
Today? ~4 million births/yr in the US. At a 2% rate, that is 80,000 births affected; at 8%, 320,000; at 15%, 600,000.
Louis Pasteur, 1862: germ theory established from flask experiments.
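The slide's arithmetic can be checked directly. The 4 million figure is the slide's approximation of annual US births; the three rates are the slide's illustrative scenarios:

```python
# Back-of-envelope check of the slide's figures.
births_per_year = 4_000_000  # approximate annual US births, per the slide
for rate in (0.02, 0.08, 0.15):
    # round() avoids float truncation (0.15 * 4e6 is just under 600000.0)
    print(f"{rate:.0%} -> {round(births_per_year * rate):,} births/yr")
```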
To Err is Human
At least 44,000 people, and perhaps as many as 98,000 people, die in hospitals each year as a result of medical errors that could have been prevented, according to estimates from two major studies. Institute Of Medicine 1999
To Err is Human
Despite finding small improvements at the margins (fewer patients dying from accidental injection of potassium chloride, reduced infections in hospitals due to tightened infection control procedures), it is harder to see the overall, national impact, Leape and Berwick say. "[T]he groundwork for improving safety has been laid in these past five years but progress is frustratingly slow." Institute of Medicine, Oct. 2005
To Err is Human
A disturbing report recently released by the Agency for Healthcare Research and Quality (AHRQ) found measurable improvement in fewer than half of the 38 patient safety measures examined. Research shows it takes 17 years before evidence-based practices are incorporated into widespread clinical use.
Institute of Medicine, May 2009
This improving-medicine stuff is hard!
Challenges in healthcare IT and research
• How can we run ad-hoc "scurvy"-style comparative effectiveness experiments in growing EHR systems?
• Evidence must be an open process; inquiry must be a central part of practice in translational and personalized medicine.
• Rapid healthcare system changes may harm patients. How will we know?
• Current research expenditures of roughly $20-30 billion yield a pipeline of about 30 new drugs per year, many of them similar.
• Translation of evidence to practice is lacking.
Something needs to change!
Openness is needed
Mission
To improve the quality of patient care and efficiency of medical research by delivering a reliable flow of clinical data and innovative software applications for our clients.
About My Company - Why I'm Here!
Best-of-breed software/analytics company focused 100% on secondary uses of clinical data.
Core Competencies:
• Clinical & research data warehousing
• Reporting and analytics
• Data strategy, governance & compliance
• Application integration
• Open source software (social networking, clinical research, caBIG)
Core Values:
• Pragmatism
• Trust
• Effective communication
Research Study Process
Clinical research data flow model
Getting things done with Open Source in Healthcare
External Open Source Yardstick? Drupal
560,699 people in 228 countries* speaking 182 languages power Drupal
WWDBD - What would Dries Buytaert do?
Six Open Source Secrets from Dries… (summary)
1. There is no get-rich-quick formula: built a user conference from 40 people in 2005 to 3,000 in 2010; held many meet-ups; be patient.
2. Hurray for growing pains: funding comes if you are serving the community, and it will support you as you grow out of your current capability.
3. Build an architecture for evolution: allow external groups to submit.
4. Provide the right tools: processes and tools; replace planning with coordination.
5. Make money but pay with trust: the open source currency is trust.
6. Leadership trumps management: make everyone a respected leader and follower.
Some Open Source projects supporting this mission

Applications:
• Indivo (PCHR)
• caTissue (tissue banking)
• SMArt (Substitutable Medical Applications)
• i2b2 (cohorts and more)
• Profiles (research networking)

Data management infrastructure:
• Pentaho Kettle (ETL)
• Mirth (HL7 integration)
• Shibboleth (single sign-on)

Core infrastructure:
• SVN/Hudson/Eclipse (development tools)
• Java/J2EE
• MySQL, Postgres
Scorecard
Infrastructure Components
Data Mgmt. Tools: Kettle and Mirth
Kettle: allows for deployable, sharable, free ETL (Extract, Transform, and Load) without dependency on a specific data warehouse implementation (Oracle, MSSQL, Informatica, DataStage, etc.), with commercial support options available from Pentaho.
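On the Mirth side, the HL7 feeds these tools route are pipe-delimited HL7 v2 text. A minimal sketch of the segment/field structure such an engine parses (the ADT message fragment is invented for illustration):

```python
def parse_hl7(message: str) -> dict:
    """Split an HL7 v2 message into {segment id: list of field lists}.

    Minimal sketch: ignores escape sequences, the component (^) and
    subcomponent (&) separators, and the MSH-1 numbering quirk.
    """
    segments: dict = {}
    for line in message.strip().split("\r"):  # segments end with carriage return
        fields = line.split("|")              # fields are pipe-delimited
        segments.setdefault(fields[0], []).append(fields[1:])
    return segments

# Hypothetical ADT fragment for illustration (not a real feed).
msg = ("MSH|^~\\&|EMR|HOSP|I2B2|DW|20110404||ADT^A01|123|P|2.3\r"
       "PID|1||MRN001||DOE^JANE")
parsed = parse_hl7(msg)
print(parsed["PID"][0][2])  # MRN001
```

A real integration engine adds routing, transformation, and acknowledgment on top of this parsing step; the point here is only the wire format.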
Infrastructure scorecard (IMHO)
Infrastructure state: Early adoption…
• Strengths
  - Cost factors highly attractive if no alternative is already in use
  - Open approach allows sharing within open projects
  - Support contracts and company success aid adoption
  - Capabilities often equal or better than commercial tools
  - Trust erosion from commercial vendors
  - Rapid innovation cycles
  - Community engagement in "vertical frameworks", e.g. Mirth
• Challenges
  - Limited trust in open source at the enterprise/CIO level today
  - Commercial databases most common in open projects
  - Embedded tools (e.g. SSIS) offer an integrated, non-open approach
  - Frequent infrastructure vendor FUD: "It'll never work!"
  - Confusion over "why do I pay" for commercial support
  - Splits: Pentaho and Talend shops divide the open community
  - Feature gap resistance/limitations
i2b2: Informatics for Integrating Biology and the Bedside
A success story in academic open source… but a work in progress…
i2b2: Overview of Research Workflow

Step 1: Source data extracted from EMR, claims, and EMPI systems (data extracts and HL7 feeds).
Step 2: Data collection: integration/ETL loads the PHI into a clinical data warehouse.
Step 3: Create a "Limited Data Set" (destroy identifiers) across the HIPAA wall.
Step 4: A one-time Data Use Agreement covers the i2b2 research mart (Limited Data Set).
Step 5: Researchers run i2b2 queries against the research mart; an IRB protocol request begins for identified data.
Step 6: Operational use for QA/QI.
Step 7: IRB approval.
Step 8: IRB-approved PHI is retrieved on schedule.
Step 9: An i2b2 research mart with patient identifiers is built.
Step 10: The query is re-issued to the data mart with PHI.
The data center acts as honest broker and provides server support.
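Step 3's "destroy identifiers" transform can be sketched as a simple field filter. The field names below are hypothetical; HIPAA's limited data set rules define the actual list of direct identifiers that must be removed (dates, unlike in full de-identification, may be retained):

```python
# Hypothetical direct identifiers for illustration; HIPAA defines the
# real list a limited data set must exclude.
DIRECT_IDENTIFIERS = {"name", "mrn", "ssn", "address", "phone"}

def to_limited_data_set(record: dict) -> dict:
    """Strip direct identifiers while keeping dates and clinical facts."""
    return {k: v for k, v in record.items() if k not in DIRECT_IDENTIFIERS}

row = {"mrn": "MRN001", "name": "DOE, JANE", "dob": "1970-01-01",
       "dx_code": "E11.9", "admit_date": "2011-04-04"}
limited = to_limited_data_set(row)
print(limited)  # dob, dx_code, and admit_date survive; mrn and name are gone
```

In practice this runs inside the honest-broker ETL, with a key table kept behind the HIPAA wall so Step 10 can re-link the approved patients.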
i2b2 History
i2b2 is linked to over $130M in active research projects at Partners, involving 892 data marts, 1,867 users, 315 teams, and 12,000 queries per year.
1999: RPDR initially established at Partners
2000-2004: RPDR and tool evolution at Partners
2004: i2b2 grant to "port RPDR" as an open source framework
2006: First beta release of i2b2
2008: Strong adoption
2009: De-facto CTSA-standard clinical research data warehouse (CRDW)
2010: i2b2 renewal
* RPDR (Research Patient Data Registry) was the precursor to i2b2
i2b2 Hive
i2b2 Open Source Scorecard (IMHO)
Adoption
Ecosystem for Recombinant (and i2b2)
Good for business (Service opportunities)
• Data sets/reports
• i2b2 workbench and plug-ins
• Ontology navigator
• i2b2 server/security
• CRC data repository
• Custom Java extensions
• Data extraction systems
• Data quality (cleansing rules/logic)
• Maintenance/application support
• HW/SW/network operations
• Interoperability with CTSA sites
• Collaboration with HIT groups
• IRB/HIPAA auditing
• Data governance/management
• Indemnification/risk
(Weighed against benefits and TCO factors)
The Johnny Appleseed principle…
A land grant required a settler to "set out at least fifty apple or pear trees" as a condition of the deed.
Good for business - Grants
Install i2b2 properly
Get better scores on grant proposals
Increase research capacity
Good for Business (Grants)
Seven of twelve sites were planning a data warehouse (4 already had data warehouses – Columbia, Duke, Rockefeller and Pitt)
Michael Becich CTSA review AMIA 2008
GO grants with i2b2 = 5 CTSA Supplementals, Prospect, etc.
U Mass 2010… 11 (lowest score) funded! U MN 2011… from 45 to under 15!
i2b2 extensions: Hacktivation
Top 10 list for i2b2’s open source success
1. Strong leadership: Zak/Shawn/Suzanne (Harvard)
2. CTSA grant cycles create demand
3. Community support/dialog: AUG & outreach
4. NCBC support grant drivers
5. Software development execution (Shawn Murphy)
6. Commercial implementation and pharma partnership
7. Does not force a specific ontology or way of working
8. SHRINE federation spreads/standardizes install needs
9. Simplicity of the "cohort" core, like Twitter or Google
10. Proven prior success with RPDR in use
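The "cohort" core in point 9 is essentially counting patients who match a set of concepts. A toy sketch in that spirit; the column names mirror i2b2's observation_fact star-schema table, but the data and the query helper are invented for illustration:

```python
# Toy cohort query over an i2b2-style fact table: each row is one
# observation (patient_num, concept_cd). Data is invented.
observation_fact = [
    {"patient_num": 1, "concept_cd": "ICD9:250.00"},   # diabetes dx
    {"patient_num": 1, "concept_cd": "RX:metformin"},
    {"patient_num": 2, "concept_cd": "ICD9:250.00"},
    {"patient_num": 3, "concept_cd": "RX:metformin"},
]

def cohort(concepts):
    """Patients having at least one fact for every listed concept (AND)."""
    per_concept = [
        {f["patient_num"] for f in observation_fact if f["concept_cd"] == c}
        for c in concepts
    ]
    return set.intersection(*per_concept)

print(len(cohort(["ICD9:250.00", "RX:metformin"])))  # 1 (patient 1 only)
```

Returning counts rather than records is what lets this style of query run against a limited data set before any IRB protocol is filed.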
Gap: Why are there few to no “external commits”?
• Grant funding/credit incentives: innovation vs. coordination
• A "don't expose too soon" approach
• Can the community contribute?
• Legacy of the PHI/development framework
• TRUST
• Budget allocation
• Maturity
What happens when funded research and development ends?
Gaps: Many sites are “early/mid-adoption”
Why?
• Governance challenges: unclear legislation/interpretation
• Many groups can say "no" during the process
• Budget cuts/economic impacts on projects
• Interference from Meaningful Use & ACO initiatives
• Changing cultures is harder than adding tools
• Still immature/incomplete software
• Implementations are complex and require data
• Trust among parties is still limited
Lessons learned to date
• Be patient with academic projects.
• Building trust requires demonstrated delivery, not talk.
• Just make it work; the rewards come later.
• Avoid splintering projects, but keep moving forward rapidly.
• Academic open source projects naturally oversell themselves, which creates expectation problems.
• Open source culture in healthcare/academia is unique and evolving (a desire to control is embedded).
Open data + open software
Clinical Intelligence problem dimensions
• Compound
• Biomarker: SNPs, NextGen seq., mRNA, ELISA
• Disease
• Indication
• Outcome
• Internal results
• Public results
• Experimental platforms
• Trial phase
• Subjects
• Researchers
• Resources
• Finance
• Literature
• Clinical trials
• Images
Infrastructure Solution
Mapping capabilities to processes
tranSMART consortium

Consortium members:
• Pharma: J&J, Millennium
• Providers: St. Jude, UCSF, CINJ
• International: U-BIOPRED (IMI)
• Others

tranSMART.org platform (external data hosted on the Amazon cloud):
• Private access and public access (public users from pharma and providers)
• Pipelines / search
• Data services
• Shared data
• Curation management
• Open source
Sage Bionetworks Mission
Sage Bionetworks is a non-profit organization with a vision to create a "commons" where integrative bionetworks are evolved by contributor scientists sharing the goal of accelerating the elimination of human disease.
• Building disease maps
• Data repository
• Commons pilots
• Discovery platform
Sagebase.org
Lessons learned to date…
• Pharma is in the science business and will move to open frameworks if they are provided.
• Strong support is available for a new strategy.
• Progressive leadership is critical (e.g. E. Perakslis, J&J).
• Openness with clinical research data is an integrated problem.
• Early challenges with commercial conflicts.
• Tool/data adoption will be a challenge.
• Be patient…
REDCap
Openness -> Adoption -> Impact
• Wikipedia vs. the printed encyclopedia
• Proprietary and pure open tools struggle for adoption in AMCs vs. REDCap
• Consortium model
Why is REDCap successful?
1. Leadership: Paul Harris & Vanderbilt
2. Simple core capability
3. CTSA standards conversion
4. CTSA support for Vanderbilt/incentives
5. Engineering execution
6. Strong community engagement (calls, etc.)
7. Consortium commits to community development
8. Rapid implementation ("hacktivation")
REDCap challenges
• Not open to non-academics
• Highly controlled (e.g. no ports to commercial DBs)
• Difficult to mix models (open and "academic consortium")
• Success of the model discourages open source
• EDC in regulated environments (pharma?)
• Needs to be paired with other EDC options
Profiles Research Networking Software
Social Networks (Facebook for researchers)
Professional Open Source Model
The Crowd and the Cloud
Profiles Scorecard
Lessons learned to date
• The real pressure on us is marketing/dissemination.
• Lower grant funding drives deeper commitment to open source.
• Academic roadmaps are overly optimistic.
• Engagement overlaps with "who gets funded."
• Commercial client expectations are high for low price points at small scale.
• Processes are needed to match the commercial roadmap with the open project roadmap.
Westinghouse vs Edison – AC vs DC
1893: Niagara Falls contract marks the rising dominance of AC.
1903: Topsy the "bad" elephant is electrocuted with high-voltage AC (Edison "reinvents" FUD).
Change is resisted by very smart people. FUD is an IT institution.