KEY CONSIDERATIONS FOR ANALYTIC SOLUTIONS FOR LIFE SCIENCES: CHOOSING ABSTRACTION, THE CLOUD, AND VISUALIZATION
White Paper
This document contains Confidential, Proprietary and Trade Secret Information (“Confidential Information”) of Radiant Advisors. No part of this publication may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, electronic or mechanical, including photocopying and recording, for any purpose other than the purchaser’s personal use without the written permission of Radiant Advisors.
While every attempt has been made to ensure that the information in this document is accurate and complete, some typographical errors or technical inaccuracies may exist. Radiant Advisors does not accept responsibility for any kind of loss resulting from the use of information contained in this document. The information contained in this document is subject to change without notice.
All brands and their products are trademarks or registered trademarks of their respective holders and should be noted as such.
This edition published May 2014.
Key Considerations for Analytic Solutions for Life Sciences
Table of Contents
Executive Summary
Information Challenges in Life Sciences
The Integration Challenge
The Management Challenge
The Discovery Challenge
Tackling Today’s Information Challenges in Life Sciences
Choosing Data Abstraction for Unification
Centralizing Context in the Cloud
Getting Visual with Self-Service
Selected Examples of Abstraction, Cloud, and Visualization
Democratizing Information Through Integration
Using the Cloud as a Research Enabler
Visualizing Social Media Analytics for Life Sciences
Conclusion
References
Executive Summary
From pharmaceuticals to global health to the environment, twenty-first century life sciences companies are becoming data-driven, folding vast amounts (and new forms) of data into processes that span from research and development to sales and marketing. As in many industries, the data is explosive: the rate of data generation in the life sciences has reportedly already outpaced Moore’s Law, which predicts that technology capacity will double every two years. With every piece of detailed raw data now able to be affordably stored, managed, and accessed, the technologies that analyze, share, and visualize information and insights will need to operate at this scale too. This transformation is challenging, not only because of the sheer volume of data to manage, but because to date there has been a lack of data integration agility – a critical success factor in life sciences. Many traditional (and even new) approaches to data architecture have led to complex data silos (or data copies) that offer only an incomplete picture of the data and slow down the ability to provide access or gain timely insights. Further, the control of intellectual property and compliance with many regulations pose a bevy of operational, regulatory, and information governance challenges. Making matters even more complex, the life sciences environment is by nature one of non-stop change, growth, and financial investment, with sixty-eight percent of life sciences companies expected to increase overall sales and marketing IT spending over the next fiscal year. Now, a strong emphasis on analytics and data discovery for new insights is introducing additional challenges in how data is woven into the fabric of life sciences organizations.
Today’s analytic challenges for life sciences companies fall into three distinct categories: the integration challenge, the management challenge, and the discovery challenge. The answer to these challenges, however, isn’t the development of new tools or technologies. In fact, the old ways – replication, transformation, the data warehouse, and even newer desktop-based approaches to analytics – have met with limited or siloed success: they simply don’t afford a process agile enough to keep up with the surge in data size, complexity, and disparity. Nor should life sciences companies rely on the expectation of increased funding to foster additional solutions. Rather, they should turn to collaborative and transformative solutions that already exist. By embracing a data unification strategy built on the adoption, continued refinement, and governance of a semantic layer that enables agility, access, and virtual federation of data, and by incorporating scalable, cloud-based technologies that provide advanced analytic and discovery capabilities – including visualization – life sciences companies can continue to become even more data-capable organizations.
Information Challenges in Life Sciences
In life sciences companies, future successes and discoveries – whether in R&D initiatives to discover new drugs, or in sales and marketing functions that must remain competitive, address market needs for education and information, and deliver the right doctor and patient services – hinge on the ability to quickly and intuitively leverage, analyze, and take action on the information housed within data. Unfortunately, assessments of core data challenges within the life sciences have noted that existing data tools and resources for analysis lack integration – or unification of data sources – and can be difficult to both disseminate and maintain. Further, the life sciences research literature and testimonies describe another research-impeding challenge: the management challenge posed by defining access rights and permissions to data, addressing governance and compliance rules, and centralizing metadata management. Finally, balancing the freedom of the business to work with new data sources and data discovery against the need to control consistency, govern proper contextual usage, and leverage analytic capabilities is a further challenge increasingly in need of mitigation.
Kolker Labs, a research entity that creates partnerships to identify data challenges and provide solutions in the life sciences community, collaborated with researchers at the University of Washington to assess data and analysis needs. Their survey of life scientists indicated the immediate need for tools and resources to easily access publicly available experiment data with analysis tools in a user-friendly interface, and the ability to share data for collaboration. Further, through Data-Intensive Science Workshops (DISWs) – sponsored by the National Science Foundation (NSF) – the Data-Enabled Life Sciences Alliance, or DELSA Global, identified three top information challenges among the life sciences community. In this study, the top two challenges included 1) the necessity to integrate work across diverse domains, and 2) the need for reproducibility and analytic capabilities. The end goal, then, is to synergize research across the life sciences using contemporary computing approaches to comprehend large and diverse data.
The Integration Challenge
Having access to data – all data – is a requirement in life sciences companies, as well as a long-standing barrier. In fact, a core expectation of the scientific method – according to the National Science Board data policy taskforce – is the “documentation and sharing of results, underlying data, and methodologies.” Highly accessible data not only enables the use of vast volumes of data for analysis, but it also fosters collaboration and cross-disciplinary efforts – enabling collective innovation within life sciences companies.
For life sciences companies, success depends largely on reliable and speedy access to data and information, and this includes information stored in multiple formats (structured and unstructured) and research locations (on-premise, remote premises, and cloud-based). Further, there must exist the ability to make this data available to support numerous tactical and strategic needs – including providing correct and reliable information to doctors and patients, optimizing recurring and multichannel marketing activities, and improving sales force effectiveness and efficiency – through standards-based data access and delivery options that allow IT to flexibly publish data. Reducing complexity when federating data must also be addressed, and this requires the ability to transform data from native structures to create reusable views for iteration and discovery. Ultimately, the ability to unify multiple data sources to provide researchers, analysts, and managers with the full view of information for decision-making and innovation, without incurring the massive costs and overhead of physical data consolidation in data warehouses, remains a primary integration challenge and the core barrier to overcome in the next generation of life sciences data management. Further, this integration must be agile enough to adapt to rapid changes in the environment, respond to source data volatility, and navigate the addition of newly created data sets.
The Management Challenge
Another challenge is the guidance and deposition of context and metadata, and the sustainment of a reliable infrastructure that defines access and permissions and addresses various governance and compliance rules within the strict context of the life sciences industry. Traditional data warehouses enabled the management of data context through a centralized approach and the use of metadata, ensuring that users had well-analyzed business definitions and centralized access rights to support self-service and proper access. However, in highly distributed and fast-changing data environments – coupled with more need for individualized or project-based definitions and access – the central data warehouse approach falls short and prioritizes the needs of the few rather than the many. For life sciences companies, this means the proliferation of sharing through replicated and copied data sets without consistent data synchronization or managed access rights. In order to mitigate the risks associated with data, enterprise data governance programs are formed to define data owners, stewards, and custodians, with policies that provide oversight for compliance and proper business usage of data through accountabilities. The management challenges for data environments such as these are: granting permission to access data for analysis prior to integration, defining the data integration and relationships properly, and then determining who has access permissions to the resulting integrated data sets.
These challenges are no different for data warehousing approaches or data federation approaches; however, there is a high degree of risk when environments must resort to a highly disparate integration approach where governance and security are difficult – or nearly impossible – to implement without being centralized. Management challenges with governance and access permissions are equally procedural and technological: without a basic framework and the support of an information governance program, technology choices are likely to fail. Likewise, without a technology capable of fully implementing an information governance program, the program itself becomes ineffective.
The Discovery Challenge
A third information challenge for life sciences companies could be referred to as a set of “discovery challenges.” Chief among these is balancing the need to enable the discovery process against the need to maintain proper IT oversight and stewardship over data – or, freedom versus control. This differs from the integration and management challenges in that it affects not only how data is federated and aggregated, but also how it is leveraged by users to discover new insights. Because discovery is (often) dependent on user independence, the continued drive for self-service – or self-sufficiency – presents further challenges in controlling the proliferation generated by the discovery process as users create and share context. A critical part of the challenge, then, is how to establish a single view of data to enable discovery processes while governing context and business definitions. Discovery challenges go beyond process and proliferation, too, and include providing a scalable solution that opens even broader sources of information to discovery, such as data stored (and shared) in the cloud. Analytical techniques and abilities bring additional challenges to consider: as discovery and analysis become increasingly visual, visualization capabilities need to be layered on top of analytics. Identifying and incorporating tools into the technology stack that can meet the needs of integration, analytics, and discovery simultaneously is the crux of the discovery challenge.
Tackling Today’s Information Challenges in Life Sciences
Choosing Data Abstraction for Unification
Data abstraction through a semantic layer supports timely, critical decision-making as different business groups become synchronized with information across units, reducing operational silos and geographic separation. The semantic layer itself provides business context to data to establish a scalable, single source of truth that is reusable across the global organization. Abstraction also overcomes data structure incompatibility by transforming data from its native structures and syntax into reusable views that are easy for end users to understand and for developers to build solutions on. It provides flexibility by decoupling the applications – or consumers – from data layers, allowing each to work independently in dealing with changes. Together, these capabilities help drive the discovery process by enabling users to access data across silos and analyze a holistic view of data. The inclusion of a semantic layer centralizes metadata management, too, by defining a common repository and catalog between disparate data sources and tools. It also provides a consolidated location for data governance and for implementing underlying data security, and it centralizes access permissions, acting as a single unified environment to enforce roles and permissions across all federated data sources.
Further, context reuse will inherently drive higher quality in semantic definitions as more people accept – and refine – the definitions through use and adoption. Because future life sciences innovations and data-enabled discoveries will require data to be integrated and analyzed conjointly, a secure and scalable approach to managing data that allows for exploration, analysis, and visualization is needed to address the scale and complexity of the data. The inclusion of a semantic layer, too, gives business context to raw data, establishing a reusable, single source of the truth across the organization.
Centralizing Context in the Cloud
The growing amount of data in life sciences companies emphasizes the need not only for integration of the data, but for access to and storage of the data, too. Cloud platforms offer a viable solution through scalable and affordable computing capabilities and large data storage. Recent Accenture research reports that cloud computing has gone from an idea to a core capability, with many leading life sciences companies approaching new systems architectures with a “cloud first” mentality. This shift, however, is not driven only by the scalability and storage cost efficiency of the cloud; it is also influenced by the ability to centralize context, collaborate, and become more agile. Taking the lead to manage context in the cloud is an opportunity to establish much-needed governance early on, as cloud orientation becomes a core capability over time. While the impact of cloud computing has been felt most acutely by the marketing and sales functions within life sciences to date, use of this technology is expanding into R&D, commercial, supply chain, and enterprise functions.
With the addition of a semantic layer for unification and abstraction, data stored in the cloud can be abstracted quickly and with centralized context for everyone – enabling global collaboration. Several life sciences use cases have shown that using the cloud drives collaboration, allowing life sciences companies’ marketing, sales, and research functions to work more iteratively and with faster momentum. Ultimately, where data resides will have a dramatic effect on the discovery process – and trends suggest that more and more data will eventually be moved to the cloud. Moving abstraction closer to the data, then, just makes sense – it paves the road for future life sciences innovations.
Getting Visual with Self-Service
Providing users with tools that leverage abstraction techniques keeps data oversight and control with IT, while simultaneously reducing the dependency on IT to provide users with the data needed for analysis. Pairing this self-service (or self-sufficient) approach to discovery with visual analytic techniques drives discovery one step further by bringing data to a broader user community and enabling users to take advantage of emerging visual analytic techniques to visually explore data and curate analytical views for insights. Visual discovery makes analytics more approachable, allowing technical and nontechnical users to communicate through meaningful, visual reports that can be published (or shared) back into the analytical platform – whether via the cloud or on-premise – to encourage meaningful collaboration. Self-sufficient visual discovery and collaboration will benefit greatly from users not having to wonder where to go to get data – everyone would simply know to go to the one repository for everything.
While life sciences companies have relied heavily on exploratory and reporting graphics across many functional areas – including research, clinical trial development, and sales and marketing – there are significant differences between these types of visualizations that affect the ability to visually analyze data and discover new insights. Traditional BI reporting graphics (the standard line, bar, or pie charts) provide quick-consumption communications that summarize salient information. With exploratory graphics – or advanced visualizations (such as geospatial, quartals, decision trees, and trellis charts) – analysts can visualize clusters or aggregate data; they can also experiment with data through iteration to discover correlations or predictors and create new analytic models. These tools for visual discovery are highly interactive, enabling underlying information to emerge through discovery, and typically require the support of a robust semantic layer.
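To make the role of the semantic layer in this section more concrete, the following minimal Python sketch shows one way the idea could look in miniature: native column names are mapped to business terms in reusable views, and roles and permissions are enforced in a single place. It is an illustration only, not a description of any particular product. The class names (`View`, `SemanticLayer`), the sample sources, the column names, and the roles are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class View:
    """A reusable business view over one underlying source (hypothetical)."""
    rows: list    # stand-in for a live connection to the source system
    fields: dict  # business-friendly name -> native column name
    roles: set    # roles permitted to read this view


@dataclass
class SemanticLayer:
    """Central catalog: view definitions and permissions live in one place."""
    views: dict = field(default_factory=dict)

    def register(self, name, view):
        self.views[name] = view

    def query(self, name, role):
        view = self.views[name]
        # Permissions are checked here, once, for every federated source.
        if role not in view.roles:
            raise PermissionError(f"role {role!r} may not read view {name!r}")
        # Native column names are translated to business terms at read time;
        # nothing is persisted as a second copy of the source data.
        return [
            {business: row[native] for business, native in view.fields.items()}
            for row in view.rows
        ]


# Two hypothetical sources with incompatible native structures.
clinical_rows = [{"PT_ID": "p1", "TRT_ARM": "A", "RESP_FLG": 1}]
sales_rows = [{"acct": "dr-17", "rx_cnt": 42, "terr": "West"}]

layer = SemanticLayer()
layer.register("trial_outcomes", View(
    rows=clinical_rows,
    fields={"patient": "PT_ID", "treatment_arm": "TRT_ARM", "responded": "RESP_FLG"},
    roles={"researcher"},
))
layer.register("prescriber_activity", View(
    rows=sales_rows,
    fields={"prescriber": "acct", "prescriptions": "rx_cnt", "territory": "terr"},
    roles={"researcher", "sales_analyst"},
))

print(layer.query("prescriber_activity", role="sales_analyst"))
```

In this sketch, a visualization or reporting tool would depend only on the business names, so a renamed native column means changing one mapping rather than every consumer. A request such as `layer.query("trial_outcomes", role="sales_analyst")` is rejected centrally, without each source needing its own access rules.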
Selected Examples of Abstraction, Cloud, and Visualization
The examples described below require the ability to aggregate and assimilate significant amounts of both semi-structured and unstructured data within a platform that can provide accurate and concise analytics. The inclusion of a semantic layer – especially one that leverages the cloud – facilitates the continuous updating of information and context required for discovery and analysis. Additionally, these examples illustrate the increasing emphasis on including a graphical user interface that provides effective data visualizations to communicate and display the discoveries made in analysis.
Democratizing Information Through Integration
In the multi-billion dollar life sciences industry – which includes medical device makers, where there is increasing adoption – the discovery of new drugs and the smooth delivery of products through wide-reaching sales and marketing channels are key to success. Yet, in all of these things, data is the key: research findings, clinical trial results, manufacturing process validation, and R&D-intensive cycles generate an extreme amount of information, in which each new data point can have a major influence. Other information, such as customer sentiment data, is also important to capture and include in analysis, as patients will become increasingly influential in the future of healthcare. To enable this, life sciences companies require access to data that allows true democratization of research and information. In a two-year anonymized study, GlaxoSmithKline (GSK) used text analytics software to mine online parenting websites in an effort to understand and analyze the concerns – regarding safety, timing, and comfort – that motivate parents to delay vaccinations after a measles spike in 2011. Capturing candid sentiment data directly from parents allowed the company to provide doctors with better educational materials and information to supply to parents and patients. By integrating and analyzing this unstructured data against current vaccination data, this research has helped the pharmaceutical company reconsider how it helps physicians communicate inoculation information.
Using the Cloud as a Research Enabler
Many life sciences companies – notably pharmaceutical companies like Pfizer, Eli Lilly & Co., Johnson & Johnson, and Genentech – are demonstrating the viability of utilizing the cloud for scalability, agility, collaboration, and sharing, especially in research and development (R&D) processes. These efforts support the claim that moving larger and larger life sciences data sets into the cloud is inevitable, illustrating again the importance of moving abstraction closer to the data to enable global sharing processes – internally within organizations as well as with third-party partners – and to centralize context management. Eli Lilly launched a 64-machine cluster in the cloud to work on bioinformatics sequence information, executed the work, and shut the project down within twenty minutes. Lilly’s Senior Systems Analyst for Discovery IT noted that while exact cost savings are difficult to calculate, using the cloud helped to circumvent “spiky utilization” and achieve significant time and cost savings.
Visualizing Social Media Analytics for Life Sciences Research
Today, life sciences companies’ marketing and sales teams are leveraging abstraction to integrate analytics, collaboration, and visualization as they seek new ways to measure brand perception, capture customer sentiment, and gain insights into relevant competitive information. Concerns over the lack of regulatory guidance and potential liability related to adverse events and product complaints have made life sciences companies slow to embrace social media. However, life sciences companies are now adopting social media as a new, cost-effective, and rich “source of information” marketing channel that allows them to engage directly with customers and patients to measure sentiment and gather real-time market research data to improve existing products and stimulate further innovation. Project:EVO is a collaborative partnership between Pfizer and therapeutic game developer Akili Interactive Labs to design mobile video game technology that measures cognitive differences in healthy older adults to identify early warning signs of Alzheimer’s disease. By comparing levels of amyloid (the main component of brain plaques and a risk factor for developing Alzheimer’s) with performance characteristics, Pfizer hopes to identify biomarkers that could help identify at-risk populations for future clinical trials and drug development. In addition to relying on robust analytic capabilities and the integration and analysis of multiple forms of data, this project uses gamification and visualization techniques to discover and communicate insights.
Conclusion
Today’s explosion of data in life sciences companies poses challenges not only because of the sheer volume of data to store, manage, and interpret, but also because of its complexity and distribution across various technology, application, and geographical silos. Further, achieving a complete view of collected data to discover insights in ways that are agile, flexible, and iterative poses additional obstacles to analysts – not to mention the obstacles posed to IT in managing governance and providing access to data in ways that satisfy data security and privacy regulations. Within the life sciences literature, navigating and understanding data has been described as “the greatest challenge to unlocking knowledge and scientific discovery.” Unlocking knowledge and scientific discovery, in this context, requires that analysts and researchers have access to complete, high-quality, and actionable information in a way that is agile – and that leverages available tools and technologies to drive analytics and discovery. By choosing abstraction for unification, embedding business context into data through the inclusion of a semantic layer, leveraging cloud technologies, and enabling business users with self-service tools that offer robust analytic capabilities including advanced visualization, life sciences companies can continue on their journey to becoming even more data-capable organizations.
References
Accenture 2013 Technology Vision. “Every Life Sciences Business is a Digital Business.”
Gregory Hather, Winston Haynes, Roger Higdon, Natali Kolker, Elizabeth Stewart, Peter Arzberger, et al. “The United States of America and scientific research.” PLoS One 5 (2010), e12203.
IDC Health Insights. 2012. “Worldwide Pharmaceutical Social Media Analytics.”
IDC MarketScape. 2013. “Worldwide Life Sciences Sales and Marketing ITO 2013.”
iGate Research. TechConnect 4, no. 7 (n.d.). Social Analytics for Life Sciences.
James Temple. 2014, Jan. 9. “Brainteaser: Can an iPad game detect Alzheimer’s?” CNBC Technology.
Joel Schectman. 2013, May 1. “Glaxo mined online parent discussion boards for vaccine worries.” The Wall Street Journal, CIO Journal.
Lindy Ryan. (2014). “From self-service to self-sufficiency: How discovery is driving the business shift.” Radiant Advisors, Insight Series.
National Science Board. 2011. “Digital Research Data Sharing and Management.”
PricewaterhouseCoopers. 2009. “Pharma 2020: Marketing the future, which path will you take?”
Rick Mullen. “The new computer pioneers.” Chemical & Engineering News 87, no. 21: 10-14.
Roger Higdon, Winston Haynes, Larissa Stanberry, Elizabeth Stewart, Gregory Yandl, Chris Howard, William Broomall, Natalie Kolker, and Eugene Kolker. “Unraveling the complexities of life sciences data.” Big Data 1, no. 1 (2013): 42-50.
About the Author
Lindy Ryan, Research Director, Data Discovery and Visualization
As Research Director for Radiant Advisors’ Data Discovery and Visualization practice, Lindy leads research and analyst activities in the confluence of data discovery, visualization, and data science from a business needs perspective.
Sponsored by:
Birst is the only enterprise-caliber Business Intelligence platform born in the cloud. Less costly and more agile than Legacy BI and more powerful than Data Discovery, Birst is engineered with an automated data warehouse and rich, visual analytics, to give meaning to data—all types and sizes. Coupled with the agility of the Cloud, Birst gives business teams the ability to solve real problems. Fast. Life sciences companies are using Birst to integrate data from inside and outside their organizations, quickly finding answers to their most difficult questions, mapping out local go-to-market strategies and driving increased sales. To learn more about Birst and how it is helping life sciences organizations to think fast, visit www.birst.com/lifesciences.
About Radiant Advisors
Radiant Advisors is a strategic advisory and research firm that networks with industry experts to deliver innovative thought-leadership, cutting-edge publications and events, and in-depth industry research.
© 2014 Radiant Advisors. All Rights Reserved.
Radiant Advisors
Boulder, CO USA
Email: info@radiantadvisors.com
To learn more, visit www.radiantadvisors.com.