Open Source: How a Data Scientist Is Bringing Precision Medicine to the People
In some ways Elizabeth Worthey, Ph.D., is a typical nerd. She likes having lots of computer screens and solving puzzles, and she likes nothing so much as coming up with novel ways to use software to solve complex problems.
Unlike most geeks, though, Worthey is intimately familiar with her own source code. A chunk of her laptop hard drive is dedicated to a complete copy of her DNA, which she had sequenced nearly a decade ago.
‘I had them all’
When she hears about an interesting new deleterious variant at a genetics conference, “I go in and look at my genome and see if I have it or not,” said Worthey, who joined UAB in July as director of the Bioinformatics Section in the Division of Genomic Diagnostics and Bioinformatics in the Department of Pathology, director of the Center for Computational Genomics and Data Sciences in the Department of Pediatrics, and associate director of the Hugh Kaul Precision Medicine Institute, all in the School of Medicine.
Worthey hails from the Vale of Leven, in between Loch Lomond and the River Clyde on the west coast of Scotland, an area where heart problems run rampant. “Nobody in my family has cardiovascular disease, though,” she said. “I was at a conference and they put up a list of protective genetic variants. I looked and I had them all.”
The original genomic miracle
She knows from personal experience that many people aren’t so fortunate. In 2009, Worthey was part of a team at the Medical College of Wisconsin that was the first to solve a medical mystery with precision medicine. Worthey and her team created a unique software program, CarpeNovo (Latin for “seize the new”), that identified the ultra-rare genetic mutation responsible for 4-year-old Nicholas Volker’s devastating illness. With this crucial information, clinicians were able to identify a treatment (bone marrow transplant) that saved the boy’s life. The case later became the subject of a Pulitzer Prize-winning series and a book by reporters at the Milwaukee Journal Sentinel. The Volker case was the first in the world to demonstrate the power of genomic sequencing and analysis in patient care, but the costs and time involved in sequencing and analysis meant these methods could only be used in extraordinary cases.
Later, at the HudsonAlpha Institute for Biotechnology in Huntsville, Worthey and her team developed new software tools that have dramatically reduced analysis time. “That first clinical case took three months, and fortunately the child was healthy enough to wait that long,” Worthey said. “Many times they are not. We developed some of the first methods that allowed shortening of that timeframe down to where we are today, which is being able to do sequencing and analysis in less than a week, or a couple of days in some cases.”
Making the miracles routine
Worthey’s mission is to help open up genetic insights as a routine part of clinical care at UAB. She is working with Alexander “Craig” Mackinnon, M.D., Ph.D., the inaugural director of the Genomic Diagnostics and Bioinformatics division in Pathology, to support the Precision Diagnostics Laboratory, which will combine and enhance efforts across the hospital.
Genetics, Pediatrics, Pathology and other departments at UAB are all increasingly using genomic sequencing as a routine part of patient care, explained George Netto, M.D., the Robert and Ruth Anderson Endowed Chair in the Department of Pathology. “So it makes sense to bring all of that under one roof,” Netto said.
A single whole genome sequence consists of 3.2 billion DNA letters and takes up half a terabyte of hard-drive space. The first step in genome-based medical analysis is to align those billions of letters against a reference genome. Then software tools are used to identify the 6 million or so differences, or variants, that any one person will have compared to another. Most of these aren’t linked to a disease or are otherwise benign. In fact, only one or two are likely to be responsible for the patient’s primary problem.
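For readers who think in code, the align, identify and filter steps described above can be pictured with a toy sketch. This is an illustration only, not the software Worthey's team built: the sequences are made up, the sample is assumed to be already aligned to the reference and the same length, and the names `call_variants`, `filter_candidates` and `KNOWN_BENIGN` are hypothetical. Real pipelines work on billions of letters and also handle insertions, deletions and sequencing errors.

```python
from typing import NamedTuple


class Variant(NamedTuple):
    position: int  # 1-based position in the reference
    ref: str       # letter in the reference genome
    alt: str       # letter observed in the patient's sample


def call_variants(reference: str, sample: str) -> list[Variant]:
    """Compare a pre-aligned sample to the reference, letter by letter."""
    if len(reference) != len(sample):
        raise ValueError("toy example expects equal-length, pre-aligned sequences")
    return [
        Variant(i + 1, r, s)
        for i, (r, s) in enumerate(zip(reference, sample))
        if r != s
    ]


def filter_candidates(variants: list[Variant], known_benign: set[Variant]) -> list[Variant]:
    """Drop variants already catalogued as benign, keeping the rest for review."""
    return [v for v in variants if v not in known_benign]


# Made-up data: a 12-letter "genome" instead of 3.2 billion letters.
REFERENCE = "ACGTACGTACGT"
SAMPLE = "ACGAACGTTCGT"
KNOWN_BENIGN = {Variant(4, "T", "A")}  # hypothetical catalogue of harmless variants

variants = call_variants(REFERENCE, SAMPLE)
candidates = filter_candidates(variants, KNOWN_BENIGN)

print("All variants:", variants)
print("Left for interpretation:", candidates)
```

In this toy case two differences are found and one survives the filter, which mirrors the idea of narrowing millions of variants down to the one or two that matter.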
“A big part of precision diagnosis is interpretation and computation, looking at the patient’s code and filtering out what is abnormal from what is normal,” Netto said. “That is just as important as what kind of ‘Cadillac’ sequencing machine you have. Liz built the tools for that at HudsonAlpha, and she is doing the same here.” The challenge keeps getting bigger, Netto added. UAB now sends out more than a thousand patient samples for sequencing each year, and that number will only grow, he said.
Sorting out the squishy bits
In some critical cases, turnaround times of many days aren’t fast enough. “Ideally, we want to be able to get it down to under 24 hours,” Worthey said, “and in my lab we are exploring ways we think we can do that.”
Speeding up the process requires more than faster hardware and better code, however. “There’s also the squishy human bit,” she said. “Think about the NICU. The process of sequencing and software with the output being a clinical report is one thing. But who reads it? Is it a neonatologist? If they have questions, who do they call? I can do the first two and help define the process. But you have to actually have a health care system to put it into health care. That’s why I’m at UAB.”
Figuring out how to incorporate genomic data into patients’ electronic health records is also a top priority for James Cimino, M.D., director of UAB’s Informatics Institute. Worthey will be collaborating with Cimino’s team on the code and protocols needed to make that happen.
Tools for lots of people
She has seen how useful this data can be. After her daughter was born a few years ago, doctors diagnosed Worthey with an autoimmune thyroid disease. A quick search of her genome identified the mutation likely responsible. “If I had had in my medical record the note, ‘Has a thyroid stimulating hormone variant known to confer significant risk of autoimmune thyroid disease,’ I could probably have gotten a diagnosis 10 years before I did, which would have been very helpful,” she said.
Having access to her entire genome can be a little frightening, too, Worthey noted. “I have two little kids,” she said. “I think, ‘Please don’t let me find anything terrible: cancer or Alzheimer’s.’ It makes you very aware of the issues.” But she is not dissuaded from the act of looking. Worthey has used her genome as test data, or when she needs an illustration during public speaking engagements. “I probably have one of the most studied genomes on the planet; by me,” she said.
Curiosity is the hallmark of a scientist, and Worthey’s enthusiasm is infectious. She brought seven lab members with her from HudsonAlpha and will be hiring more, mainly software developers with computer engineering backgrounds but also data scientists and research scientists trained in interpreting molecular variation. “Most of these folks have worked in other industries before they came to the light side,” she said with a laugh. “We’re trying to develop tools that are commercial-grade, that are designed to be placed in the hands of lots of people.”
A place for collaborations
Worthey intends for her Center for Computational Genomics and Data Sciences “to be a place for collaborations, where people with clinical and translational questions about patient populations can come to find a bunch of folks who know how to analyze that data,” she said. Her team is involved with the Alabama Genomic Health Initiative and the Undiagnosed Diseases Program, both led by UAB Chief Genomics Officer Bruce Korf, M.D., Ph.D. They