The digital dependency of genomics
Genomics is the study of the genome, an organism's complete DNA set, and the interaction of genes with one another and their environment. Unlike genetics, which studies individual genes, genomics is interdisciplinary and focuses on the collective characterisation and quantification of all the genes in an organism.
DNA sequencing highlights the genetic information carried by a particular gene, while genomic sequencing provides information on genetic variations that contribute to the development of disease.
Sync with Big Data and data analytics
Genomics data is a great example of the size and complexity of Big Data. In a perfect world, the entire human genome can be encoded in about 700MB to 800MB of data. In the real world, however, sequencing a whole human genome can require up to 200GB of storage.
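As a rough check on the 700MB to 800MB figure, the arithmetic can be sketched in a few lines of Python. The sketch assumes about 3.1 billion base pairs in one copy of the human genome and two bits per base (enough to distinguish A, C, G and T). Neither number is stated in this article, and the function name is purely illustrative.

```python
# Back-of-the-envelope size of an ideally encoded genome.
# Assumptions (not from the article): ~3.1 billion base pairs in one copy
# of the human genome, and 2 bits per base (A, C, G, T -> 00, 01, 10, 11).

BASE_PAIRS = 3_100_000_000  # approximate length of one genome copy
BITS_PER_BASE = 2           # four possible bases fit in two bits


def ideal_genome_megabytes(base_pairs: int = BASE_PAIRS,
                           bits_per_base: int = BITS_PER_BASE) -> float:
    """Return the size in megabytes (10**6 bytes) of a bit-packed genome."""
    total_bytes = base_pairs * bits_per_base / 8
    return total_bytes / 1_000_000


if __name__ == "__main__":
    ideal_mb = ideal_genome_megabytes()
    print(f"Ideal encoding: about {ideal_mb:.0f} MB")
    # Compare with the article's real-world figure of up to 200GB.
    print(f"200 GB is roughly {200_000 / ideal_mb:.0f} times larger")
```

Under these assumptions the ideal size comes to about 775MB, inside the range quoted above. The gap to the real-world figure is plausibly explained by sequencers reading each position many times and storing a quality score alongside every base, although the exact overhead depends on the instrument and file format.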
The growth of genomics, with its increasing use of next-generation sequencing (NGS), creates an exponentially growing need to leverage Big Data analytics to identify clinically actionable genetic variants for precision genomic medicine. While integrating diverse genomic data with electronic health records poses challenges, it also provides an opportunity to develop an effective and efficient approach to identifying actionable genetic variants for personalised diagnosis and therapy.
AI/ML in genomics
A key challenge to incorporating genomic data is the lack of standards for NGS data generation, data sequencing/processing, data storage, and clinical decision support. Due to the frequent evolution of tools in NGS technology, it has been hard to establish standards. A lack of standards has led to difficulty in interoperability regarding data quality. These data management and analysis challenges can be overcome using AI/ML algorithms.
AI/ML also promises to simplify and speed up genome interpretation by integrating predictive methods. This opens a whole range of possibilities in terms of predictive risks of diseases, not limited by generations. At the next level, this shall help identify the precise medical interventions that can prevent or treat such ailments. While the use of AI/ML is very promising, it should be approached with caution as its analytical insights and results will have direct clinical impacts on patients.
Ubiquity of cloud
Cloud computing is the perfect fit for analysing genomic datasets quickly without maintaining and upgrading servers. Simply put, the pay-as-you-go model of the cloud offers the flexibility to scale computational power and storage up and down as needed. This flexibility in computing is desirable for small and medium businesses in life sciences.
Despite the clear utility of the cloud to genomics, wide adoption of the technology is yet to be seen. The primary reason for the lack of adoption is the hesitancy of businesses regarding long-term costs. While storage is charged per gigabyte, the cost of computational power in the cloud can sometimes be five times the cost of onsite computation.
As more players come to the market, prices will become more competitive, and cloud technology in genomics will be here to stay.
The security angle
As the British mathematician Clive Humby says, "data is the new oil." And just like oil over the past century, players across the spectrum, both public and private, are vying for data. The more complicated the data is to gather, the greater the demand for it in the market. Genomics data is a perfect example of this, making it a target of cyber threats. A potential attack on genomic data can significantly affect the confidentiality, integrity, and availability of the relevant systems.
According to the National Center for Biotechnology Information, some possible attack scenarios include biological substance attacks, malicious hardware/firmware implantation, and NGS software compromise. Along with strong policy development, the cybersecurity community will play a significant role in thwarting attacks and maintaining the integrity of genomic data and systems.
The digital technologies mentioned above are critical to the success of the genomics industry and its impact on the world of medicine. Organisations that offer services in Big Data & Analytics, Artificial Intelligence/Machine Learning, cloud computing, and cybersecurity with expertise in genomics and life sciences will be uniquely positioned to be instrumental in the future of the life sciences industry. The rapid growth and stabilisation of digital technologies is a precursor to a transformation in the application of genomics that will change every individual's life within the next few years.
India has a tremendous ecosystem and significant support from government organisations. The Genome India Project, initiated by the Department of Biotechnology, is an example of the Indian State's interest in this field. Similarly, the IndiGen Project founded by the Institute of Genomics and Integrative Biology under CSIR can be bolstered through AI-based solutions. With multiple use cases for genomics in public health, such as cancer research and precision medicine, the digital ecosystem, along with India's genomic labs, can be leveraged to advance such genomics projects. Organisations in India that provide an end-to-end array of services for life sciences, from genomics labs to digital services through domain experts, will be the game changers for India's march toward leadership in genomics.