SPECIAL ADVERTISING REPORT
SUMMER 2021
Artificial Intelligence Meets Human Creativity
UF’s HiPerGator AI Supercomputer is an Unprecedented Research Tool
John Jernigan
Summer 2021, Vol. 26, No. 2
About the cover: Erik Deumens, UF’s director of research computing, has led the team building HiPerGator AI, the nation’s most powerful university-owned artificial intelligence computer. The computer, pictured on the cover and on this page, is powered by 140 NVIDIA DGX A100 systems.
Kent Fuchs, President
David Norton, Vice President for Research

HiPerCollaborations
World-leading AI capabilities at UF are positioning Florida’s economy for the future

Data Prospecting
Computing power and data drive artificial intelligence advances

Vital Signs
Leveraging data for better health outcomes

Trusting Tech
Artificial intelligence can combat deepfakes, cybercrimes and snooping

Capturing a World of Data
Computing power is the key to analyzing a changing environment

All-Seeing Algorithms
Building ethics into artificial intelligence systems
Board of Trustees: Mori Hosseini (Chair), David C. Bloom, David L. Brandon, Cooper L. Brown, Richard P. Cole, Christopher T. Corr, James W. Heavener, Daniel T. O’Keefe, Thomas G. Kuntz, Rahul Patel, Marsha D. Powers, Fred S. Ridley, Anita G. Zucker

Explore is published by UF Research. Opinions expressed do not reflect the official views of the university. Use of trade names implies no endorsement by the University of Florida. © 2021 University of Florida. explore.research.ufl.edu

Editor: Joseph M. Kays, joekays@ufl.edu
Art Director: Katherine Kinsley-Momberger
Design and Illustration: Katherine Kinsley-Momberger, Ivan J. Ramos
Writer: Cindy Spence
Photography: John Jernigan, Jesse Jones, Tyler Jones
Web Editor: Karla Arboleda
Copy Editor: Bruce Mastron
Printing: RR Donnelly, Orlando
Extracts
News briefs
Member of the University Research Magazine Association www.urma.org
HiPerCollaborations
World-leading AI capabilities at UF are positioning Florida’s economy for the future
When I took over as vice president for research in 2012, the University of Florida was just embarking on an exciting technological journey that has been a key driver in our scientific excellence and output. UF was building a $15 million Data Center to house HiPerGator, a $3.4 million high-performance computer that came online in 2013 and established a foundation for generations of discovery. Computing and research have grown hand in hand as HiPerGator, now in its third generation, made powerful data science tools available to all faculty and many students, contributing significantly to UF’s $942 million in research spending last year.

The work of the last decade set the stage for UF’s giant next step. In 2020, UF alumnus Chris Malachowsky and NVIDIA, the company he cofounded, gave UF cutting-edge processing tools and training to build the most powerful university-owned artificial intelligence computer. Supercomputers are generally measured by CPU speed, but with AI it is graphics processing units, or GPUs, that do the heavy lifting, and NVIDIA is the world leader in the development of GPUs. HiPerGator AI currently employs 140 of NVIDIA’s newest, most powerful DGX A100 systems.

With the launch of HiPerGator AI in January, UF’s opportunity to be a leader in computation, data science and AI had arrived. Faculty and students now have access to the best AI technology on the planet, and we are committed to seizing this opportunity to the fullest. Building on this foundational gift, UF has continued to grow its AI investment with millions of dollars of additional support from the state and from other donors. The AI initiative will have ripple effects well beyond the UF campus for artificial intelligence research, education and innovation.
The response to this opportunity from the UF faculty across disciplines has been incredible. Researchers with existing AI expertise are accelerating their programs at warp speed. Others whose applied research areas are well-suited for these approaches are pivoting toward AI. Last year, UF Research created a $1 million AI Research Catalyst Fund and issued a call for proposals to fund up to 20 projects. It drew 133 new AI ideas, all worthy of pursuit, spanning disciplines from the College of the Arts to the College of Engineering.

With a recent $20 million-a-year allocation from the Florida Legislature, we are now recruiting 100 new faculty with AI expertise. AI is being infused across our curriculum and throughout our research laboratories. Our already collaborative culture on campus has shifted into overdrive as scientists and scholars in all disciplines pursue novel ideas. For
example, Alina Zare and her team in the Machine Learning and Sensing Lab collaborate across more than a dozen disciplines. Daisy Wang’s Data Science Research Lab helped organize a coding and machine learning competition with the goal of producing open-source tools that up every researcher’s data science game. At UF Health, a collaboration of medical and data scientists and NVIDIA experts put HiPerGator AI through its early paces to launch GatorTron™, a model for extracting health insights from medical records. The model was developed using 10 years of anonymized medical data — up to 90 billion words of clinical notes — in just seven days.

The collaborations extend beyond campus, too. UF is sharing its computing wealth to make equity a cornerstone of the AI Initiative. For purposes of research, UF invited other institutions in the
State University System and the 19 institutions in the Inclusive Engineering Consortium, a group of historically minority-serving institutions, to use HiPerGator AI. Then, we invited our Southeastern Conference colleagues and 100 other institutions to use HiPerGator AI for teaching.

The combination of an agile AI workforce and AI research is transforming the future. By the end of the decade, the AI economy will represent over $10 trillion in economic output. The World Economic Forum estimates AI will result in a net gain of 12 million jobs by 2025. UF is providing a whole generation of workers the skills to seize these new opportunities.

The AI Initiative promises to change the academic and research landscape at the University of Florida in the 21st century. It is exciting to be a part of UF’s continued rise to the top.
Data Prospecting
Computing power and data drive artificial intelligence advances
By Cindy Spence
Not so long ago, a scientist might say she could never have too much data. Even today, in a world drowning in data, it is better to be data-rich than data-poor. But data is not knowledge. Although data has been called the new oil, a precious resource, finding the relevant in the midst of the irrelevant is a task too big for mere mortals. It takes supercomputing, says University of Florida research computing Director Erik Deumens, to turn data into knowledge.

“That’s where artificial intelligence comes in,” Deumens says. “We’ve been in the age of big data for 20 years now, and the problem is an overwhelming amount of data. How do we sort it, find the correlations, figure out what it all means? Sometimes the data is so complicated and diverse, the human brain just cannot grasp it.”

For AI, however, data is a feast, and supercomputing makes it possible. UF was already at the forefront of supercomputing with three generations of its high-performance HiPerGator computer. But a gift in 2020 from UF alumnus Chris Malachowsky and NVIDIA, the company he helped found, provided a massive computing boost, and the combination of HiPerGator with NVIDIA DGX SuperPOD™ architecture created one of the most powerful supercomputers in all of higher education.

With supercomputing, science — and knowledge generation — can go faster, Deumens says. Science is full of trial and error and a lot of waiting. Scientists put an experiment in motion, wait to see if it works, then try again if it fails.
Deumens uses an example from chemistry, one of the departments with which he is affiliated, where chemists often do calculations that take several days. “Then at the end, they get one number, the energy of their chemical reaction, and they’re very excited about that one number,” Deumens says.

With supercomputing, however, instead of doing one calculation that takes several days and produces one number, chemists can ask the computer to do multiple calculations to look for the best number. “The AI goes away and does all these calculations, then says this is the best number. In the past when people tried it, it wouldn’t work, but now it gives better numbers faster than any human can. AI gives better, more accurate results. The process gets accelerated by applying AI.”

The more data we collect, the more important it is to make sense of it, says Robert Guralnick, the informatics curator at the Florida Museum of Natural History.
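Deumens’ chemistry example boils down to a batch search: instead of one expensive calculation yielding one number, dispatch many candidate calculations and keep the best result. A minimal Python caricature, with a toy stand-in for the energy calculation (a real workflow would call a quantum-chemistry code, and the function below is purely illustrative):

```python
# Caricature of AI-assisted search: evaluate many candidate geometries
# and keep the lowest-energy one, instead of hand-running a single
# multi-day calculation. The energy function is a stand-in; a real
# workflow would call a quantum-chemistry package.

def energy(bond_length):
    """Toy potential-energy curve with a minimum near 1.0 (arbitrary units)."""
    return (bond_length - 1.0) ** 2 - 5.0

def best_candidate(candidates, energy_fn):
    """Evaluate every candidate and return (best_candidate, best_energy)."""
    scored = [(c, energy_fn(c)) for c in candidates]
    return min(scored, key=lambda pair: pair[1])

# Sweep a grid of candidate bond lengths; on a supercomputer these
# evaluations would run in parallel rather than one after another.
grid = [0.8 + 0.05 * i for i in range(10)]
best, e = best_candidate(grid, energy)
print(best, e)
```

In practice the speedup comes both from running candidates in parallel across GPUs and from models that propose promising candidates rather than sweeping a blind grid.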
“We cannot human-power our way through data to make sense of the globe. Right now, we’re gathering beyond petabytes of data,” Guralnick says. “We need to speed up our ability to extract the knowledge we need from these streams of data. We need to derive knowledge in a timescale where we can make relevant decisions.”
AI Connections

When you ask around campus who scientists are collaborating with on the AI and machine learning front, one name crops up frequently: Alina Zare. Zare’s Machine Learning and Sensing Lab stays busy developing algorithms to automate analysis of data from a wide range of sensors, including ground-penetrating radar, LiDAR and hyperspectral and thermal cameras. In the lab, two post-docs, 17 Ph.D. students, two master’s students and a cadre of undergraduates work on projects with collaborators from agronomy, psychology, the Florida
Museum, horticulture, entomology, ecology and a host of other computer scientists, both on campus and at other institutions.

Collaborations, Zare says, are a joy. In collaborative work, the convergence of different viewpoints uncovers what is really essential about a problem or dataset, and the teamwork advances both her field and the fields of her collaborators.

Zare, a professor of electrical and computer engineering, uses supercomputing in her work on supervised learning for incomplete and uncertain data. For 19 years, Zare has worked on explosive hazard detection with funding from the Army Research Office and the Office of Naval Research, developing algorithms for sensors to detect underwater explosive devices and landmines. Sensors help humans see more than they would be able to see with their eyes, Zare says, such as root systems and buried landmines.

Former agronomy department
Chair Diane Rowland, who was recently appointed dean of the College of Natural Sciences, Forestry, and Agriculture at the University of Maine, says AI was not part of her research on peanuts until she began collaborating with Zare four years ago. The machine learning capabilities she gained are helping in work to detect peanut toxins that may hide underneath the hulls of harvested peanuts and research on observing root growth without damaging plants.

“Without that collaboration and the gift of the NVIDIA computing power, a lot of this work isn’t possible,” Rowland says. “You can gather all the data in the world, but unless you can analyze it in complex ways, you lose some of the potential answers the data could provide.”

Zare says her lab relies heavily on NVIDIA GPU systems and especially the new DGX A100 system. The openness of datasets also is important, Zare says. The code and methods for all her projects are shared broadly, the better to advance machine learning while advancing other sciences as well.

Deumens says the most telling example of the collaborative spirit came during the first round of AI Catalyst awards in the summer of 2020. The awards, sponsored by UF Research, are seed funding, a leg up to get good ideas off the ground and generate proof of concept that can attract support from national agencies. Usually when seed funding is announced, it might attract 40 applications, says Deumens, who sits on the review committee that distributed the $1 million fund.

“Last year, we had 133 applications for 20 awards,” says Deumens, “but there were 75 additional proposals that were considered worthy. And almost all of them were cross collaborations with multiple departments to try and do something really new and innovative.”

The 75 additional proposals didn’t get the catalyst funding, but they did get allocations of time on HiPerGator for their research.

“That made clear to me that this campus has a lot of really smart people, and they’re ready and willing to work together to solve really interesting problems in all fields,” Deumens says. “From religion to motion detection in medicine to agriculture to space, you name it.”

And the collaborations extend beyond UF to other campuses and industries. Electrical and computer engineering Associate Professor Damon Woodard, director of AI partnerships, says UF has established a partnership with the Inclusive Engineering Consortium, which includes 19 historically minority-serving institutions. Also in the works is a partnership to connect NVIDIA researchers to UF researchers to work on common research problems. Industry days are also planned to bring companies to UF, where their researchers can meet UF researchers who share similar interests. And in the fall, a hackathon is scheduled — with access to IBM’s Watson — to use real-world datasets to tackle climate change and environmental problems.

Vice President for Research David Norton says UF is trying hard to make AI available to anyone with a good AI idea.

“At the end of the day, you can have a great machine, great infrastructure, but if you don’t have the talented people, the faculty, you’re not going to realize gains,” Norton says. “People that have not really done AI before are now jumping into it, and we’re making that possible.

“We’re in a position that would have been unimaginable four years ago.”

Data Center

When the UF Data Center opened on the East Campus in 2013, the foundation was laid for the gift from Malachowsky and NVIDIA that would come seven years later. The Data Center is connected to the main campus via two 100 gigabit per second links in two separate multi-fiber pathways for redundancy. Three generations of HiPerGator have called it home, and it had room to grow.

The investment in the data center, Deumens says, put UF in a unique position to be able to even accept the gift from Malachowsky and NVIDIA.

“I worked for IBM as a consultant for some years, and one of the things that would happen is IBM would donate a machine to a university, the supercomputer of the day, worth millions of dollars, and six months later not a single job had run on it,” Deumens says. “So I worked to train IT people to use the machine.

“A machine that is $60 million, massive computing capacity, a machine people dream of … you can’t just give that and flip a switch.

“When you add 1.6 megawatts of electrical power, do you have the people who can negotiate these contracts?”
UF did, and spent $15 million upgrading the electrical and cooling capacity, Deumens says. On top of that, much of the work took place during COVID. Many people worked miracles with supply chains to get the racks and pipes and engines in time to accept the NVIDIA system.

The Research Computing staff, too, seasoned by nearly a decade with HiPerGator, knew how to deal with the almost immediate computing demand. Researchers interact through four login nodes, where at any one time, 400 people are active per node. Hundreds of thousands of jobs a day are processed. In just the first months of this year, the number of accounts rose from 3,500 to 4,000.

Other state institutions have been invited to use the computing capacity for research, and SEC institutions and a list of about 100 other universities nationwide will be able to use it for education. And a collaboration between UF scientists and engineers at NVIDIA has already produced GatorTron™, which crunched data from 10 years of medical records for 2 million patients and 50 million patient interactions to come up with a medical research tool — in just seven days.

Deumens says GatorTron™ is a kind of pilot that shows the power of AI. “It was an interesting scientific problem, and it required a machine this powerful,” Deumens says.

With 100 faculty hires in AI on the table for the near future, the investment in AI and supercomputing is paying dividends. “Not everybody can take home a gift like this and be successful at it,” Deumens says.
Intelligent Machines

As the future of AI unfolds, Zare says one of the most important responses from a machine might be three little words: I don’t know.

“We as humans are pretty good at saying ‘I don’t know.’ But most AI systems right now are not,” Zare
says. “They always give an answer. If you ask, ‘Is this a pine tree or an oak tree?’ and you show it a picture of an elephant, it’s going to say, ‘That’s an oak tree,’ for sure.

“Coming up with systems that can say, ‘I don’t know’ is a big challenge and an important one.”

For example, a soldier walking behind an explosive hazard detection system would be safer if the machine said “I don’t know” when it encounters a potentially explosive object, rather than taking a 50-50 guess. In medicine, where AI is on the rise, uncertainty is an important element in diagnoses. If an AI system encounters symptoms it does not expect, it would be better for it not to guess. Zare says the key is to train the system to set aside the inputs it can’t reliably process, so that humans can intervene.

In 50 years, Zare predicts, machines will still need humans to
guide them, even as they process more data than ever before. That guidance, she says, can come in many forms. For example, researchers are working on physics-guided AI, in which knowledge about the world can be embedded into the AI system.

“I think we’re really, really far away from having systems that can transfer knowledge from one problem to another, which is something we’re pretty good at as people,” Zare says.

José Principe, the director of the National Science Foundation-funded Center for Big Learning, says achieving human-level intelligence using engineering and computer science methodologies is a realistic challenge that will just take time. He agrees with Zare that despite advances, computer-human interaction should be synergistic.

“We know how to teach computers to use data to make decisions in
well-defined domains. We can teach machines to recognize images, speech and sound. But there are many other areas that are beyond the current capabilities of AI,” Principe says.

One example of computers making decisions is self-driving vehicles, which
encounter uncertainty and conflicting evidence. In an ideal world, Principe says, autonomous vehicles would be tested in a virtual reality environment where they could crash with no consequences and learn from mistakes.

“Unfortunately, we don’t have simulators for the real world. At a certain point, researchers decide they have enough confidence in their algorithms to take them into the field. And we are exactly at that point, right? Some things
work very well, but others the machine cannot predict. As long as the machine is better than the average driver, society probably gains something.”

Principe started his career applying technology to medicine. Today, he’s done a 180-degree turn and uses biology to inform technology in his Computational NeuroEngineering Laboratory, where he works on brain-machine interfaces, among other things. He is working on creating mathematical frameworks that represent cognitive processes so that they can be programmed in computers. “In the past seven years, I’ve learned quite a bit about cognitive science, and I’m fascinated by the way the brain organizes past experience.”

Still, he says, science is a long way from mimicking human intelligence, so fears about AI, today at least, are unfounded. “We still don’t know everything the brain does,” Principe says. “We have all these fears of things that don’t exist and probably never will.”

From the earliest technologies — the bow and arrow, for instance — Principe says humans have wrestled with sets of societal rules for controlling their use. AI is no different.
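Returning to Zare’s point about systems that can say “I don’t know”: the simplest mechanism is a reject option, in which a classifier abstains whenever its top predicted probability falls below a confidence threshold. The sketch below is illustrative only; in practice, raw model confidence is often itself overconfident on inputs like the elephant photo, which is why principled abstention remains an open research problem rather than a one-liner.

```python
# Sketch of a classifier with a reject option: if the top probability
# falls below a confidence threshold, answer "I don't know" instead of
# forcing a label. Probabilities are supplied directly here; in practice
# they would come from a trained model's output layer.

def classify_with_rejection(probs, threshold=0.7):
    """probs: dict mapping label -> probability.
    Returns the most likely label, or "I don't know" when no label
    clears the confidence threshold."""
    label, p = max(probs.items(), key=lambda kv: kv[1])
    return label if p >= threshold else "I don't know"

# Confident case: the model commits to an answer.
print(classify_with_rejection({"pine": 0.05, "oak": 0.95}))  # oak
# Hedged case: no label is confident enough, so the system abstains.
print(classify_with_rejection({"pine": 0.55, "oak": 0.45}))  # I don't know
```

The threshold trades coverage for safety: raising it makes the system abstain more often, handing more inputs to a human, which is exactly the intervention Zare describes for explosive detection and diagnosis.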
“There are things we can do that perhaps we shouldn’t do,” Principe says. “My concern is that the speed of technology innovation is exponentially increasing, but our societal decision making is not.”

Deumens agrees and says students and researchers will benefit from UF’s interdisciplinary approach to AI, which goes beyond feats of engineering. “UF is trying to educate everybody about all aspects of AI, so that when our students graduate, they are not necessarily computer scientists, but they understand what AI is and how to use it ethically,” Deumens says.

Guralnick points out that AI can build into algorithms the very biases that have been problematic for society for a long time. That makes it important to be thoughtful about the ethics of AI, even as we embrace the technical advances. Computer scientists can’t tackle AI alone.

“We need to keep social good in the forefront of AI. For me, a social good is clean water, a healthy environment and a playing field that allows success to happen. We need holistic problem-solving.

“There is not just one problem to solve in our society, there are hundreds, and they’re all interconnected,” Guralnick says. “Everyone’s sort of solving their piece, but what is so cool about the AI initiative at UF is that there’s enough people all recognizing the interconnectedness. AI as a partnership asks how we can solve many problems in a connected way across our university.

“I’m so proud of the institution for recognizing that this isn’t just about computer science.”

Erik Deumens
Director of Research Computing
deumens@ufl.edu

Alina Zare
Professor of Electrical and Computer Engineering
azare@ece.ufl.edu

José Principe
Professor of Electrical and Computer Engineering
principe@cnel.ufl.edu
Vital Signs
Leveraging data for better health outcomes
By Cindy Spence
Azra Bihorac says one of the most important collaborations for doctors and nurses in the future will be with the computer at a patient’s bedside. Computers will become as important as stethoscopes in hospitals and clinics, says Bihorac, whose research at UF Health focuses on artificial intelligence and emerging technologies for the care of critically ill patients.

“Emerging technologies provide an enormous opportunity for physicians to be able to collect data and process it on the fly in an intelligent way,” says Bihorac, the R. Glenn Davis Chair in Clinical and Translational Medicine. “To process every clinical data value for every patient, for the human brain, that is too much. But using the capacity of AI, we can synthesize the data and come to an intelligent conclusion.”

Computers instructed by precise algorithms will unlock data now buried in electronic health records, report readings from sensors that monitor patients, and record real-time data from more traditional tools like thermometers and stethoscopes and blood pressure cuffs, organizing the data onto a dashboard for everyone on a care team to see.
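The bedside dashboard Bihorac describes is, at its simplest, an aggregation problem: pool each patient’s latest readings and flag the out-of-range ones for the care team. A toy sketch follows; the normal ranges below are illustrative placeholders, not clinical reference values.

```python
# Toy sketch of bedside-data aggregation: pool readings from several
# sources into one dashboard row per patient and flag out-of-range
# vitals. The ranges below are illustrative placeholders, not
# clinical reference values.

NORMAL_RANGES = {
    "heart_rate":  (60, 100),    # beats per minute (illustrative)
    "temperature": (36.1, 37.8), # degrees Celsius (illustrative)
}

def dashboard_row(patient_id, readings):
    """readings: dict of vital -> latest value. Returns the dashboard
    row plus a list of vitals outside their (illustrative) range."""
    flags = [name for name, value in readings.items()
             if name in NORMAL_RANGES
             and not (NORMAL_RANGES[name][0] <= value <= NORMAL_RANGES[name][1])]
    return {"patient": patient_id, "readings": readings, "flags": flags}

row = dashboard_row("P-001", {"heart_rate": 118, "temperature": 37.0})
print(row["flags"])  # ['heart_rate']
```

The substance of the real systems lies in what this sketch omits: streaming ingestion from many devices, and models that weigh trends across readings rather than checking each value against a fixed range.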
Technology won’t replace doctors — it’s not a competition between people and machines, Bihorac says — but it will help doctors deliver better care. And doctors, she says, are ready.

The advent of electronic health records brought the advantage of portability, but also created an avalanche of information nearly impossible to process. Doctors and nurses have long clamored for a way to navigate the labyrinth of data to extract information they need without having to wade through masses of data they don’t need.

“Here’s the data, in these records, but how do you meaningfully use that data?” Bihorac says. “Doctors are looking for that next step.”

That next step is on the way in the form of GatorTron™, a first-of-its-kind technology designed to mine electronic health records in a new way, says William Hogan, the director of biomedical informatics and data science in the UF College of Medicine.
GatorTron™ is a collaboration between UF and NVIDIA, an AI partnership that began with a gift to UF of an NVIDIA DGX SuperPOD AI supercomputer, the nation’s most powerful university-owned artificial intelligence supercomputer. A team of researchers from UF and NVIDIA worked to develop GatorTron™, using 10 years of anonymized medical data from more than 2 million patients and 50 million patient interactions with UF doctors. By combining forces — research expertise, patient data and computing power — the UF/NVIDIA team came up with a model for GatorTron™ in just seven days.

“What we’ve been able to accomplish with GatorTron™ that’s never been done before is build a model using clinical notes — up to 90 billion words,” says Hogan, one of the leaders of the research effort.
OneFlorida

That big data and health care marry well is no surprise to Betsy Shenkman, who was a big data scientist before the term “big data” was popular. Early in her career, she mined data from insurance companies to find commonalities in the records that would yield insights for health care. She had mentors who tried to talk her out of this line of research and mentors who encouraged it.

Today, Shenkman oversees really big data. She is the director of the OneFlorida Clinical Research Consortium, a resource with health data for 16 million Floridians, and countless Floridians who have benefited from OneFlorida research are likely glad she listened to the mentors who encouraged her appetite for data.
And she’s seen how deploying data for research can pay off. Among the dozens of studies:

• Two UF researchers are using OneFlorida data to evaluate the benefits, risks and cost-effectiveness of lung cancer screening with low-dose computed tomography, particularly the long-term effects of false positive diagnoses, with a $1.4 million grant from the National Cancer Institute.

• A team of UF researchers made up of experts on aging, neurology and the brain is working on a pilot study on successful aging. What they found in the OneFlorida data surprised them: 45,000 Floridians age 90 and older are “superagers,” meaning they are free of Alzheimer’s, dementia or stroke, and live independently with few hospitalizations or emergency room visits. That is out of a population of more than 200,000 people over 90 in the third most populous state.

• Researchers used 600,000 patient records for a study of antibiotic use in the first two years of life to examine any relationship between antibiotic use and weight gain later in childhood. The study showed an association between the two, adding to the body of evidence that physicians need to be careful about prescribing antibiotics to very young children.
Although Shenkman has seen what big data can do, it’s about more than data. “OneFlorida was not created just for the sake of having data,” says Shenkman, chair of the health outcomes and biomedical informatics department. “OneFlorida really was created to improve the health of Floridians and to contribute to knowledge nationally about how to improve the health of adults and children and older adults throughout the United States.”
Shenkman says the volume of records helps researchers, but so does the quality of the data. Within the OneFlorida Data Trust is information on diagnoses, medications, procedures and some demographics, all anonymized to protect patient confidentiality. The trust is overseen by the OneFlorida Clinical Research Consortium and the UF Institutional Review Board. Because it represents 4,100 providers across 1,340 practices and 24 hospitals and patients in all 67 counties, the diversity of the records is also a key benefit for researchers.

“Most clinical trials today are conducted with people of predominantly European descent,” Shenkman says. “The beauty of OneFlorida is that because we have such great coverage throughout the state, we have great diversity in terms of the race and ethnicity of the people whose data are in the Data Trust. We have great diversity of age, we have people living in urban areas, rural areas.

“Our goal is to include more diverse populations in our studies, so we have a better understanding of what works for patients who are African American, what works for patients who are Hispanic or Latino,” Shenkman says. “Yes, the data are important. Yes, the computing power is important. But most important is the people and their health.”
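Cohort selection over a resource like the Data Trust can be pictured as a filter across de-identified records. The sketch below is purely illustrative: the record fields and diagnosis codes are hypothetical stand-ins, and real queries run against OneFlorida’s governed infrastructure, not in-memory dicts.

```python
# Toy cohort query over de-identified records, in the spirit of the
# Data Trust described above. Fields and codes are hypothetical
# stand-ins for illustration only.

records = [
    {"id": 1, "diagnosis_codes": {"E11"}, "age": 67, "county": "Alachua"},
    {"id": 2, "diagnosis_codes": {"I10"}, "age": 45, "county": "Duval"},
    {"id": 3, "diagnosis_codes": {"E11", "I10"}, "age": 72, "county": "Leon"},
]

def select_cohort(records, required_code, min_age):
    """Return ids of patients carrying the required diagnosis code
    at or above the minimum age."""
    return [r["id"] for r in records
            if required_code in r["diagnosis_codes"] and r["age"] >= min_age]

print(select_cohort(records, "E11", 65))  # [1, 3]
```

What GatorTron™ adds, per the section below, is extracting such criteria from free-text clinical notes, where they are not already stored as tidy structured codes.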
OneFlorida has gathered the data that makes it possible to find the right cohorts for clinical trials and studies, and now GatorTron™ will speed up the process, Hogan says. Before GatorTron™, assembling a group of patients to enroll in trials or studies could take months because of the painstaking effort of extracting the right patients from the database. GatorTron™ cuts that time frame to minutes. The natural language processing model developed for GatorTron™ is the first clinical model of its scale in the world.

Duane Mitchell, director of the Clinical and Translational Science Institute and assistant vice president for research, says GatorTron™ can read medical language and mine data at a speed humans can’t replicate, and that means faster clinical trials and results. “One of GatorTron™’s strengths is that it is much more adept and able to read and retrieve medical information with uncommon speed and accuracy,” Mitchell says. “This takes advantage of the computer power and rich medical data that UF has available.”

Organizing data in a way that makes sense for patient care is important, too, Bihorac says. She and her team in the Precision and Intelligent Medicine Partnership — a multidisciplinary group of researchers in data science, AI and informatics — are looking at ways computers can reduce bias, error and preventable harm in patient care. Developing tools that can be used in real time is important, Bihorac says, and one such tool her team has developed is DeepSOFA. SOFA stands for sequential organ failure assessment, and traditional SOFA yields a mortality prediction score based on the status of six organ
systems. DeepSOFA is powered by an algorithm developed by Bihorac’s team and uses deep learning to process data and discover trends in patient conditions. To evaluate DeepSOFA’s accuracy, the team analyzed data from more than 85,000 former patients at UF Health and a hospital in Boston. DeepSOFA consistently outperformed SOFA. The traditional SOFA model tended to underestimate the severity of illness and predict relatively low chances of death for both survivors and non-survivors. DeepSOFA also outperformed the traditional model in a single-patient case, in which a female patient was admitted to a hospital and later died from an obstructed blood supply to a lung. Two days after the patient was admitted, DeepSOFA predicted a 50 to 80 percent probability of death, compared with a 5 percent prediction by the traditional SOFA model.
A study of 600,000 patient records that has shown an association between antibiotic use in the first two years of life and weight gain later in childhood.
In the final five hours before death, DeepSOFA estimated a 99.6 percent chance of mortality compared with 51.5 percent for traditional SOFA. “DeepSOFA provides a quick summary of the risk of complications and why the computer thinks those patients might have complications,” Bihorac says. “It’s a really intelligent snapshot of a patient’s condition.” It is the first time that deep-learning technology has been used to generate patient viability scores, the researchers said. The next step is to integrate DeepSOFA with electronic health records in real time.
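The arithmetic behind a traditional SOFA-style score is simple enough to sketch: sum a 0-4 subscore for each of six organ systems, then map the total to a risk level. DeepSOFA replaces exactly this kind of static tally with patterns learned from the patient's full time series. The sketch below is illustrative only; the risk thresholds are made up and are not UF Health's clinical criteria.

```python
# Illustrative sketch of a traditional SOFA-style tally (not DeepSOFA,
# and not UF Health's implementation). Thresholds are hypothetical.

ORGAN_SYSTEMS = ["respiration", "coagulation", "liver",
                 "cardiovascular", "cns", "renal"]

def sofa_score(subscores):
    """Total of the six 0-4 organ-system subscores (maximum 24)."""
    return sum(subscores[system] for system in ORGAN_SYSTEMS)

def mortality_band(score):
    """Map a total score to a coarse, made-up risk band."""
    if score <= 6:
        return "low"
    if score <= 11:
        return "moderate"
    return "high"

patient = {"respiration": 3, "coagulation": 2, "liver": 1,
           "cardiovascular": 3, "cns": 2, "renal": 2}
print(sofa_score(patient), mortality_band(sofa_score(patient)))  # 13 high
```

A static lookup like this sees each snapshot in isolation; the appeal of the deep-learning approach described above is that it can weigh how the same six systems are trending over time.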
“What we’ve been able to accomplish with GatorTron™ that’s never been done before is build a model using clinical notes — up to 90 billion words.” — William Hogan
Jesse Jones
From Research to Care
William Hogan
Explore 15
Medicine has a host of intractable problems with solutions that advance incrementally, if at all. Researchers hope AI will provide the fuel for long-sought movement. To jump-start AI research on all fronts, UF Research provided $1 million in seed funding for new ideas through the Artificial Intelligence Research Catalyst Fund. Among the 133 proposals, 20 were funded, including one on Alzheimer’s disease (AD) research by materials science and engineering Professor Juan Claudio Nino and psychiatry and neuroscience Assistant Professor Marcelo Febo. The two are collaborating on a means of using AI to detect the early onset of AD. AD has been at the center of several decades of focused clinical and genetic research, yet much of the evolution of the disease remains unknown. Nino and Febo are attacking the problem from two fronts. From an engineering viewpoint, Nino is using his expertise in brain-inspired network architecture and connectomics. From the medical side, Febo is deploying 17 years of work in functional MRI and biophysics. Using UF’s boosted computing power, the two will use recent advances in fMRI and image analysis to look for biomarkers for early diagnosis and treatment of AD. In mouse models, the Nino and Febo labs have recently shown that targeted network attacks on brain connectivity matrices reveal differences between diseased and healthy brains. Expanding the work from mouse models to human models, however, will require supercomputing. “Machine learning and AI algorithms can reveal patterns that we cannot identify or detect ourselves,” Nino says. Febo says analyzing the brain as a network and how communication occurs between different regions of the brain requires massive computing power. “One of the limitations on research in AD is not having the computing power for the machine learning and algorithms we need to apply to three-dimensional images, like fMRI,” Febo says. “The computational power we need
John Jernigan
Making Advances with AI
Juan Claudio Nino
“One of the limitations on research in AD is not having the computing power for the machine learning … The computational power we need makes it advantageous to be doing this work at UF.” — Marcelo Febo
Azra Bihorac
makes it advantageous to be doing this work at UF.” Nino describes the evolution of network connectivity using a roadmap as an example. A motorist traveling from Gainesville to Orlando has multiple routes, and perhaps I-75 is the best. But if there is an accident on I-75, the motorist might try US 441 instead. The brain, too, will try to connect in different ways when faced with an obstacle due to lesions or degeneration of the neural pathways. Although humans often cannot detect such incremental changes in brain connections and activity, an algorithm can analyze millions of connections and help identify potential biomarkers for disruptions in connectivity associated with the onset of AD. The AI catalyst grant also will fund work on mathematical models, which Nino and Febo plan to use to attack connections to see how the brain’s network responds. Finding patterns in the relationship of one area of the brain to another will be a key to learning more about — and eventually treating — AD, which is projected to affect more than 100 million people by 2030.
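The "targeted attack" idea can be sketched in a few lines: repeatedly delete the most-connected hub in a connectivity graph and watch how quickly the largest connected component shrinks, just as closing I-75 forces traffic onto smaller roads. The toy graphs below are stand-ins for real connectivity matrices, which are vastly larger; nothing here reflects the labs' actual code.

```python
from collections import deque

def largest_component(adj):
    """Size of the largest connected component of a graph
    given as {node: set of neighbors}, found by BFS."""
    seen, best = set(), 0
    for start in adj:
        if start in seen:
            continue
        queue, size = deque([start]), 0
        seen.add(start)
        while queue:
            u = queue.popleft()
            size += 1
            for v in adj[u]:
                if v not in seen:
                    seen.add(v)
                    queue.append(v)
        best = max(best, size)
    return best

def targeted_attack(adj, removals):
    """Delete the highest-degree node `removals` times, tracking how
    the largest component shrinks (a proxy for network resilience)."""
    adj = {u: set(vs) for u, vs in adj.items()}  # work on a copy
    sizes = [largest_component(adj)]
    for _ in range(removals):
        hub = max(adj, key=lambda u: len(adj[u]))
        for v in adj[hub]:
            adj[v].discard(hub)
        del adj[hub]
        sizes.append(largest_component(adj))
    return sizes

# A hub-dependent "star" network collapses after one targeted removal,
# while a redundant "ring" degrades gracefully.
star = {0: {1, 2, 3, 4, 5}, 1: {0}, 2: {0}, 3: {0}, 4: {0}, 5: {0}}
ring = {i: {(i - 1) % 6, (i + 1) % 6} for i in range(6)}
print(targeted_attack(star, 1))  # [6, 1]
print(targeted_attack(ring, 1))  # [6, 5]
```

In this framing, a diseased brain whose connectivity fragments like the star under attack looks measurably different from a healthy one that degrades like the ring, which is the kind of contrast the mouse-model work described above reports.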
Eye on the Patient AI can also help in monitoring patients, Bihorac says. “Quite simply, AI can observe the patient in a way no human can because
we don’t have the resources and manpower,” Bihorac says. In one current project, Bihorac and her collaborators examined patient care in the intensive care unit, where one nurse cares for one patient. That seems like a high level of interaction at first glance, Bihorac says. But when the nursing workflow was examined, it showed that nurses spent an average of only 20 percent of their time with eyes actually on the patient. The other 80 percent was spent charting and in other activities. “In other words, the patient is not observed 80 percent of the time, even in the ICU,” Bihorac says. “Things that are very subtle, like changes in facial expression and body movements associated with pain or agitation, often go undetected.” Sensors, however, can help. Using sensors placed throughout a hospital room, a patient can be observed more intensively, and sensors can pick up on cues that signal a change in health status more quickly. Sensors can track activity, mobility, fall risk, facial expression and more, and convey the information to doctors and nurses in real time. Bihorac also sees AI as a means of boosting equity in health care, helping health care environments like smaller hospitals and clinics deliver the same level of care as larger, better-resourced institutions.
“AI can help us level the playing field,” Bihorac says. “Not all physicians have the same training; not all hospitals have the same resources. We can give physicians from all backgrounds the same training and tools, and then it really doesn’t matter if you’re in a small community hospital. AI can level the playing field for patients, no matter where they are.” A human caregiver with access to accurate, computer-enabled information will make better decisions than a human without access to that information, Bihorac says. But in any scenario, she says, “the human is the most important element.”

Azra Bihorac
R. Glenn Davis Chair in Clinical and Translational Medicine
abihorac@ufl.edu

Betsy Shenkman
Chair of the Health Outcomes and Biomedical Informatics Department
eshenkman@ufl.edu

William Hogan
Director of Biomedical Informatics and Data Science, College of Medicine
hoganwr@ufl.edu

Juan Claudio Nino
Alumni Professor of Materials Science and Engineering
jnino@mse.ufl.edu

Marcelo Febo
Associate Professor and Director of Translational Research Imaging, Departments of Psychiatry and Neuroscience
febo@ufl.edu

Related Websites:
Precision and Intelligent Systems in Medicine Research Partnership
https://prismap.medicine.ufl.edu
OneFlorida Clinical Research Consortium
https://www.ctsi.ufl.edu/ctsa-consortium-projects/oneflorida/
Trusting Tech
Artificial intelligence can combat deepfakes, cybercrimes and snooping By Cindy Spence
When you can’t trust your own eyes and ears to detect deepfakes, who can you trust? Perhaps a machine. University of Florida researcher Damon Woodard is using artificial intelligence methods to develop algorithms that can detect deepfakes — images, text, video and audio that purport to be real but aren’t. These algorithms, Woodard says, are better at detecting deepfakes than humans. “If you’ve ever played poker, everyone has a tell,” says Woodard, an associate professor in the Department of Electrical and Computer Engineering, who studies biometrics, artificial intelligence, applied machine learning, computer vision and natural language processing. “The same is true when it comes to deepfakes. There are things I can tell a computer to look for in an image that will tell you right away ‘this is fake.’” The issue is critical. Deepfakes are a destructive social force that can crash financial markets, disrupt foreign relations and cause unrest and violence in cities. A video, for example, that appears to be a congressman or even the president saying something outrageous and untrue can destabilize foreign and domestic affairs. The potential harm is great, Woodard says.
Some fakes seem innocuous, like photoshopping the model on the cover of Cosmo or adding Princess Leia to a Star Wars sequel, for instance. And for now, imperfections in teeth, hair and accessories provide clues for eagle-eyed skeptics. But images can spread quickly, and it is difficult to get the mind to unsee — and unbelieve — a deepfake, making machines an important ally in the fight against them. “You could run thousands of these images through and quickly decide ‘this is real, this is not,’” Woodard says. “The training takes time, but the detection is almost instantaneous.” Deepfakes are just one element of Woodard’s work at the Florida Institute for Cybersecurity Research. Woodard also uses machine learning to analyze online text to establish authorship. Machine learning also is central to Woodard’s research to detect hardware trojans inserted onto circuit boards, a national security concern since the boards are largely manufactured outside the US. When it comes to online text, the way a person uses language can reveal their identity. Online predators who use chat rooms and other digital communities leave a trail of textual communication. Woodard and his team have come up with ways to identify people solely by the way they compose those online messages.
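A minimal sketch of this kind of stylometric attribution (far simpler than Woodard's actual models, and with entirely made-up author names and texts) compares character n-gram profiles of an unknown message against writing samples from known authors:

```python
from collections import Counter
from math import sqrt

def profile(text, n=3):
    """Character n-gram frequency profile (whitespace-normalized)."""
    text = " ".join(text.lower().split())
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))

def cosine(p, q):
    """Cosine similarity between two n-gram profiles."""
    dot = sum(count * q[gram] for gram, count in p.items())
    norm = (sqrt(sum(v * v for v in p.values()))
            * sqrt(sum(v * v for v in q.values())))
    return dot / norm if norm else 0.0

def attribute(unknown, known_texts):
    """Pick the candidate author whose writing profile best matches."""
    target = profile(unknown)
    return max(known_texts,
               key=lambda a: cosine(target, profile(known_texts[a])))

known = {
    "alice": "the quick brown fox jumps over the lazy dog again and again",
    "bob": "colorless green ideas sleep furiously every single night",
}
print(attribute("the lazy dog jumps over the quick brown fox", known))  # alice
```

Real systems use far richer features (word choice, punctuation habits, syntax) and large training corpora, but the core move is the same: reduce writing style to a measurable fingerprint and compare.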
Damon Woodard
“These models could have billions of parameters. The research we’re conducting typically involves data sets that are at least a few terabytes nowadays. Before you can even evaluate a model to see if it’s good, it has to finish the training.” — Damon Woodard
“You can read a book by your favorite author, and I don’t need to tell you the author, you just know by reading the book; you know the person’s writing style,” Woodard says. “This is kind of the same thing.” Woodard says the work complements offline physiological means of establishing identity, such as biometric tools like fingerprints, irises and facial scans. “This gives us the ability to identify that person in cyberspace,” Woodard says. Another area of national security is hardware trojans. Manufacturing of computing hardware largely occurs outside the US, making it susceptible to manipulation by foreign actors. Image analysis, computer vision and machine learning can help in three ways: • Detecting when something has been added to hardware that shouldn’t have been. • Detecting when something is missing that should be present. • Detecting a modification of something that should not have been modified.
With integrated circuits, Woodard says a computer can use an image to extract important features in all three cases. Circuits can be represented as graphs, and machine learning and deep learning can be applied to do graph analysis, which determines whether a graph is correct or has been modified. “You want to be sure everything you are looking for is here and what shouldn’t be there isn’t there,” Woodard says. With printed circuit boards, an image from a regular optical camera can be used. Computer vision would then be able to tell how many resistors are on the board, how many capacitors, how many connections. The process would be automatic and fast. And getting faster, with the help of NVIDIA computing power, Woodard says. “These models could have billions of parameters,” Woodard says. “The research we’re conducting typically involves data sets that are at least a few terabytes nowadays. Before you can even evaluate a model to see if it’s good, it has to finish the training.”
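In the simplest possible sketch, the three detection cases reduce to comparing an observed board graph against a trusted reference design. The component names and connections below are hypothetical, and a real system would extract them from imagery rather than receive them as clean dictionaries:

```python
def diff_boards(reference, observed):
    """Compare two board graphs given as {component: set of connections}.

    Returns components that were added, are missing, or whose
    connections were modified relative to the reference design.
    """
    added = sorted(c for c in observed if c not in reference)
    missing = sorted(c for c in reference if c not in observed)
    modified = sorted(c for c in reference
                      if c in observed and reference[c] != observed[c])
    return added, missing, modified

# Hypothetical netlists: the observed board has lost U1 and gained
# an unexpected part X9 wired into R1.
reference = {"R1": {"C1", "U1"}, "C1": {"R1"}, "U1": {"R1"}}
observed = {"R1": {"C1", "X9"}, "C1": {"R1"}, "X9": {"R1"}}
print(diff_boards(reference, observed))  # (['X9'], ['U1'], ['R1'])
```

The hard part in practice, and where the machine learning comes in, is producing those graphs reliably from optical images of real boards; once the graphs exist, the comparison itself is straightforward set logic.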
Supercomputing speeds up research. Just a few years ago, a researcher would have had to choose a direction of inquiry to invest in, perhaps only realizing she needed to go in another direction after the first path had run its course. With the supercomputing available via NVIDIA and HiPerGator AI, researchers can explore multiple directions at once, discarding the models that don’t work, refining the models that do.
Pictures and Privacy The camera on your smartphone actually isn’t all that smart. When you point it, everything in the viewfinder ends up in the photo. A truly smart camera could edit the scene before it takes the photo, says Sanjeev Koppal, a computer vision researcher in the Department of Electrical and Computer Engineering. Koppal and his team in the Florida Optics and Computational Sensor Laboratory are working on building such cameras, which can be selective about visual attention. “We want to be intelligent about how we capture data,” Koppal says. “That’s actually how our eyes work. Our eyes move a lot. They’re sort of scanning all the time. And when they do that, they give preferential treatment to some parts of the world around them.” “I’m interested in building cameras that can be intelligently attentive to the stuff around them.” The idea of attention is really old in AI, Koppal says. Attention — knowing what to pay attention to — is a sign of intelligence. For animals, that might be food or predators. For a camera, it’s a little more complicated. A camera that can select what to capture in a scene requires very fast sensors and optics, so that the camera can process the visual cues in a scene quickly. But it also requires algorithms that live in the camera that can change quickly depending on what the camera sees. This AI-inside-the-camera approach can be important for safety and privacy. One camera Koppal’s lab built can put
To demonstrate the sophistication of deepfakes, Damon Woodard's lab created the three images at right. The people appear to be real but do not exist.
— Sanjeev Koppal
“We want to be intelligent about how we capture data. That’s actually how our eyes work. Our eyes move a lot. They’re sort of scanning all the time. And when they do that, they give preferential treatment to some parts of the world around them.”
more emphasis on sampling for depth for objects of interest. Such an approach is useful for developing more advanced LiDAR, or light detection and ranging, a remote sensing method that relies on laser pulses and is often used for autonomous vehicles. The advantage of an adaptive, intelligent LiDAR is that it can pay more attention to pedestrians or other cars rather than the pavement in front of it or the buildings that line a roadway. In the lab-built version, fast mirrors change the focus of the camera very quickly, such that it might more easily capture a fast-moving object — a child running into the street, for example. The training data for street scenes is plentiful and diverse, making it possible to create algorithms that select which features can be dropped out of a street scene while allowing an autonomous vehicle to navigate safely based on the features that need to remain in view. The test, Koppal says, is whether the algorithm — given a new image — knows what to pay attention to. The camera uses tiny mirrors that move so fast it’s like having two cameras trained on a scene. The goal, Koppal says, is to develop a camera that can process a complicated scene, such as 10 pedestrians walking at a distance. The tiny mirrors can flip between the distant images, making sense of the scene better than today’s LiDAR. “The question becomes, where do you point this?” Koppal says. “That’s where intelligent control and the AI come in.” The camera would need to be taught to pay more attention to the group of pedestrians, for example, than to the trees lining the sidewalk. Another camera in Koppal’s lab can see better than humans, even around corners, and detect images not visible to the human eye. The camera uses a laser that bounces off of objects hidden from human view and travels back to the camera, where the light is analyzed to determine what is hidden. The camera can use the same technique to sharpen images that would be blurry to the human eye. 
“Imagine it’s raining on your windshield, and you can’t see anything,” Koppal says. “This camera can still see. It could help you. It gives you one more chance to see the same scene.” While much of the focus is on cameras that can do more than the human eye, equally important work is being done on cameras that perceive less than the human eye, Koppal says. Prototypes of these cameras create privacy inside the camera while the image is being formed. “It’s not a small point,” Koppal says. “If you capture an image, the data exists somewhere and now you have to remove the faces if you want to protect privacy. I say, don’t capture it at all, and the images you end up with are already private.”
In some situations, there is a need to monitor people without identifying them, such as on an assembly line, where a company may want to monitor employees’ safety without identifying individual employees. Another example of the need to see some things but not others is the humble household robot, such as a Roomba vacuum cleaner. Koppal says any robot in the home could be picking up sensitive information along with household dirt. A robot navigates a home by a method called SLAM, simultaneous localization and mapping, which produces low-resolution point clouds of blue and red dots that don’t look like much. The images are usually deleted after a short period. “It’s private, just this blue and red thing. But we showed that you can invert this to get back the image and see somebody’s house. You can see
what kind of products they use, and you could imagine if there was paper and enough of the dots, you could read their taxes,” Koppal says. “Your robot reading your taxes is unacceptable.” Fixing the issue of the nosy robot would require a camera Koppal’s lab has designed that can learn how to be good at some tasks — navigating your living room — but bad at others — like reading your taxes. “This is pure AI,” Koppal says. Such a camera can preserve privacy while fostering security. For instance, if a line of people is going through a metal detector at a courthouse, the camera can avoid capturing their faces. But if the camera detects a gun or knife, that signals the camera to capture the face and show the identity of the person carrying the weapon. “It’s easy to fool a human. You just put a black box over a face. But it’s not
easy to fool a machine, especially AI machines,” Koppal says. “Data that may not look like anything to us can be read by a machine.” Privacy, Koppal says, is something people often don’t value until they lose it. In what he calls the “camera culture” of his lab, the researchers like to think about the future of camera technology. “In the future, there’ll be trillions of very small connected cameras,” Koppal says. “So, I think it’s important, before those cameras are built, to think about how they function in the world. This is the time.”

Damon Woodard
Associate Professor of Electrical and Computer Engineering
dwoodard@ufl.edu

Sanjeev Koppal
Assistant Professor of Electrical and Computer Engineering
sjkoppal@ece.ufl.edu
Sanjeev Koppal's lab developed a lightweight projector and algorithm that can see objects obscured around a corner or by opacity, such as rainwater on a windshield. At right, a projector scans the scene as a camera looks from the side through water-covered glass. The system allows the camera to pick up the images on the playing cards.
The lab also conducted an experiment to show that point clouds captured using structure from motion, at left, can be reconstructed to reveal a detailed picture, at right. The experiment was the first demonstration that structure from motion images can be reconstructed into actual scenes. Many 3D vision systems use point clouds to navigate, as in a robotic vacuum cleaner.
Capturing a World of Data
By Cindy Spence
The frontiers remaining in the natural world today are not in the thickest jungles, deepest oceans and highest mountains. For naturalists today, the last frontier is data. Robert Guralnick, the biodiversity informatics curator at the Florida Museum of Natural History, says data science approaches, particularly machine learning, can help with the critical challenge of extracting the best data generated by an ever-more-closely monitored environment and using it to save global biodiversity. “We really need to be able to do this and do it well,” says Guralnick, “and relatively quickly. Data limitations are perhaps the key impediment in understanding just how quickly the planet is changing and the consequences of those changes.” Naturalists and scientists still use field notebooks, but to those analog tools they are adding the tools of artificial intelligence: • From space, satellites monitor Earth around the clock. • Closer to ground, drones provide surveillance of any terrain. • Some sensors record readings such as temperature and moisture, while others record the sounds of birds and insects. • Camera traps record the behaviors of animals when humans are not around or seasonal changes in plants. • And environmental DNA can be collected and analyzed with the latest sequencing equipment.
Kristen Grace
Computing power is the key to analyzing a changing environment
Robert Guralnick
There is much to learn, with the help of AI and computing power. “Ecology is now a big data science, but with all the data we are generating — and we are generating those at increasing rates — are we generating more rapid insights we can use to get ahead of environmental challenges?” Guralnick asks. “The answer to that question is, I think, not yet. We have to move faster. We have to be better at this.” One ecology app — iNaturalist — illustrates the data explosion. The app lets users identify plants and animals around them with the help of other users and often results in research-grade observations. Started in 2008, the app has recorded 54 million observations of more than 305,000 species. In August 2020 alone, users uploaded more than a million photographs of plants, covering about 20,000 species, about half a million of them from the U.S. Another platform, eBird, allows people to use their cellphones to record and share birdwatching data. Involving citizen scientists to collect data and then using digital tools to analyze the data has allowed researchers to monitor bird movements and migrations better than ever. “We use machine learning to classify where birds are and what they are doing across space and time,” Guralnick says. “Collective birdwatching has changed the world.” And while iNaturalist today records about 200,000 daily
observations of plants and animals, one day that number will be 1 million, and then 5 million. “That’s a hugely powerful resource for understanding what’s happening in the environment and making sense of it,” Guralnick says. “These platforms can be transformational,” Guralnick says. “AI is perhaps the critical toolkit here to actually make progress.”
Trees cover 31 percent of the world’s land area, and the ecosystem services they provide are valued in the trillions of dollars. Since plants are the foundation of all terrestrial ecosystems, and trees rule the plant world, understanding changes in forests is a key to protecting the ecosystem services — or benefits — they offer. An ecosystem can be an urban trail, a state park, a national forest. Understanding ecosystems as the scale grows becomes more challenging and requires both teamwork and technology, says Ethan White, an associate professor in the Department of Wildlife Ecology and Conservation, a part of UF’s Institute of Food and Agricultural Sciences. A multidisciplinary research team — Integrating Data science with Trees and Remote Sensing (IDTReeS) — was formed with faculty including Alina Zare in electrical and computer engineering, Stephanie Bohlman in the School of Forest, Fisheries and Geomatics Sciences, Daisy Wang in computer and information science and engineering, and Aditya Singh in agricultural and biological engineering. The group is developing ways to identify individual trees in large forests. “Changes in forests due to climate change, disturbance, land use change and forest management influence carbon storage, economics and ecosystem services,” Bohlman says. “These changes depend fundamentally on the characteristics of individual trees, but it is traditionally only possible to collect this data at very local scales using
people-on-the-ground field techniques. “Remote sensing from satellites, aircraft and drones has the potential to allow us to measure individual trees across huge areas,” Bohlman says. “That creates the potential for more informed decisions about forest management and responses to climate change.” To develop machine learning methods for doing this at large scales, White says, it made sense to use the data available from the National Ecological Observatory Network. UF is a leader in NEON, which is funded by the National Science Foundation. NEON uses flyovers to collect photographic data for ecosystems from Puerto Rico to Alaska, and White and his collaborators used the photographs to create algorithms that identify millions of individual trees in each of these forests, including over 5 million trees at UF’s NEON site — the Ordway-Swisher Biological Station. “At this phase, we want to know if we can generate data at this scale in a way that is sufficiently close to the kinds of information we’d get from really intensive field work,” White says. “Our work so far shows that we can determine where the trees are, and how big they are, and we can do this for over 100 million trees.” Because photographic data is becoming more widely available, White says the methods the team has
used can be applied widely, although scaling out to the entire U.S. would still take a long time. And, as more data pour in, the task becomes even more computationally challenging. “Everything scales really, really quickly,” White says. “But we’re getting faster and better, and HiPerGator has a substantial amount of resources, so we can go quite a bit bigger.” While carbon storage is a huge question for forestry, White says providing a method that can answer other questions at larger scales also is important. “A lot of times when we ask ecological questions, we ask them at the scale that a graduate student or small team can go out and collect the data in the field,” White says. “But we’re often interested in applying data at much larger scales. So what we’re trying to do is produce the kinds of data we’d produce in the field — size, species, leaf traits — to answer a broad suite of ecological questions. Where are the largest trees, where is the most biomass, where are the most biodiverse regions? “We are building a platform for research by providing large-scale data on forests quite broadly. We don’t just analyze it ourselves. We turn it into data products and make the products publicly available so other people can work on them and do ecology with them.”
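One small piece of a tree-mapping pipeline like the one described above, separating a classified canopy image into individual crowns, can be sketched as connected-component labeling on a binary mask. The tiny grid below is a toy stand-in for real aerial imagery, and this is not the IDTReeS team's actual method, which uses learned detectors:

```python
def label_crowns(mask):
    """Count contiguous crown regions (4-connected) in a binary canopy
    mask and return their pixel sizes, largest first."""
    rows, cols = len(mask), len(mask[0])
    seen = [[False] * cols for _ in range(rows)]
    sizes = []
    for r in range(rows):
        for c in range(cols):
            if mask[r][c] and not seen[r][c]:
                stack, size = [(r, c)], 0
                seen[r][c] = True
                while stack:
                    y, x = stack.pop()
                    size += 1
                    for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < rows and 0 <= nx < cols
                                and mask[ny][nx] and not seen[ny][nx]):
                            seen[ny][nx] = True
                            stack.append((ny, nx))
                sizes.append(size)
    return sorted(sizes, reverse=True)

# Toy canopy mask: 1 = tree pixels, 0 = ground. Three crowns.
canopy = [
    [1, 1, 0, 0, 1],
    [1, 0, 0, 0, 1],
    [0, 0, 1, 0, 0],
]
print(label_crowns(canopy))  # [3, 2, 1]
```

Region sizes are a rough proxy for "where the trees are, and how big they are"; scaling the same idea to hundreds of millions of trees is what makes the supercomputing essential.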
Tyler Jones
Ecosystem Services
“We are building a platform for research by providing large-scale data on forests quite broadly. We don’t just analyze it ourselves. We turn it into data products and make the products publicly available so other people can work on them and do ecology with them.” — Ethan White
Collaborations in ecology pull from multiple departments. From left, Daisy Wang, Ethan White, Stephanie Bohlman, Alina Zare and Aditya Singh, who are working together on analyzing forest-level data.
Nature’s Voices Images are not the only inputs. Sound, too, is a key data source in documenting the natural world, says Brian Stucky, an AI facilitator and consultant with UF Research Computing. Animal sounds convey significant information: • Which species are present in an ecosystem. • Whether there are seasonal patterns to activity. • How animals interact. • How abundant a species is in an ecosystem. • Whether there are new sounds, indicating species not yet identified. One example of sound as big data, Stucky says, is research on frogs, which are in steep decline across the world. Doctoral researcher Greg Jongsma studies African frogs, and in 2019 he visited Gabon for field work. He strapped two recorders to trees and turned them on. One recorded 10 days of sounds, the other six days. They yielded 380 hours of bioacoustics data. “As much as I love bioacoustics,” Stucky says, “nobody I know is crazy enough to sit down with a pair of headphones and try to manually analyze 380 hours of audio.” The natural world is a noisy place 24/7, so figuring out how to extract only the frog species was a labor-intensive proposition. The cacophony included frogs, to be sure, but also birds and insects. A biology student, Katie Everett, volunteered to take a stab at the task. “Eventually, we heard from Katie that after spending about two hours, she hadn’t even made it through two minutes of audio,” Stucky says. Stucky says the team developed new methods of annotating audio by hand to more quickly generate the training data needed to build the AI system to
analyze the full dataset. The system was able to identify the calls of the target frog species with great accuracy. “This is a work in progress,” Stucky says. “We’re still actively experimenting with various network architectures. Our goal here is to use these tools to analyze the daily patterns of calling activity for four key frog species.” Stucky says the AI methods have allowed the team to analyze all the audio and describe when the newly discovered frog is most active during a typical day at a level of detail that is not available for any other species of frog in Africa. “It’s breaking new ground in that way as well,” Stucky says, and Jongsma agrees. “If Brian had not developed this amazing AI-driven approach to tackling these audio recordings, they likely would have remained on SD cards, collecting dust, quite possibly never to be used,” Jongsma says. “I can see so much potential for asking big questions that would have been insurmountable in terms of bridging big data and long-term field data collection without an AI specialist like Brian.” If AI can do the heavy lifting, Stucky says, scientists can find new ways to answer biological questions, simply by deploying acoustic monitoring and analyzing the sounds.
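A common first step in this kind of analysis, before any neural network is involved, is flagging high-energy stretches of audio as candidate calls so that annotators and classifiers don't have to wade through hours of silence. The sketch below runs on a synthetic signal; the window size and threshold are arbitrary, and this is not the UF team's actual pipeline:

```python
from math import sin, pi

def loud_windows(signal, window, threshold):
    """Indexes of fixed-size windows whose mean squared amplitude
    exceeds the threshold -- candidate call segments."""
    hits = []
    for w in range(len(signal) // window):
        chunk = signal[w * window:(w + 1) * window]
        if sum(s * s for s in chunk) / window > threshold:
            hits.append(w)
    return hits

# Synthetic "recording": silence, a loud tonal call, silence.
signal = ([0.0] * 100
          + [sin(2 * pi * 0.1 * i) for i in range(100)]
          + [0.0] * 100)
print(loud_windows(signal, window=50, threshold=0.1))  # [2, 3]
```

Triage like this turns 380 hours of raw audio into a much shorter list of segments worth hand-annotating, which is exactly the bottleneck the team's annotation methods were built to attack.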
Data and Diversity With Earth in the midst of what has been called the sixth great extinction and species disappearing 1,000 times faster than normal, there’s little time to spare in conquering the data frontier for natural resources. “Documenting the rate and drivers of biodiversity change is critical because those losses have important consequences for ecosystem services
that underlie human society, like food, fuel and fiber,” Guralnick says. “We’re moving toward disequilibrium and losing diversity before we even know what we’re losing.” The traditional mainstay of ecological study — field work — isn’t going anywhere. But more and more, naturalists will be teaming up with data scientists, or learning the skills of data science for themselves. White is a big believer in data science tools in the service of ecology. He helped build the national Data Carpentry organization, which teaches researchers how to use the tools of data science like programming languages and databases. This spring, the IDTReeS group ran a data science competition in which teams used open remote sensing data to design algorithms to identify and classify trees. “The idea behind the competition is that teams will develop new ways to process ecological information, and those methods can benefit scientists everywhere,” says Wang, one of the leaders of the program. “An open challenge can attract solutions from a broad range of participants and help us refine methods to help us solve problems from various research teams. “Our efforts are among the first to apply data science to the field of ecology,” Wang says. Wang, Zare, and White also teach classes to help lay a foundation for future ecologists in tools like machine learning. Raising baseline tech skills for scientists can have a huge impact on ecology, White says. “We can’t just get someone from Google to handle data for us. This requires a real fusion of domain expertise with advanced technical skills, so it’s essential for these projects to have a whole range of people from biological experts through machine learning and computer science experts who are all capable of interacting and talking with one another. That’s a really difficult thing to accomplish, and it takes time to figure out.”
John Jernigan
One of Greg Jongsma's camera traps in Gabon.
Brian Stucky
Cross-disciplinary collaborations, like the one with Zare’s Machine Learning and Sensing Lab and Wang’s Data Science Research Lab, leverage both skill sets. Zare says the computer scientists may not be experts on an ecological problem, but if an ecologist hands her team a curated dataset, it could have a big impact. Ecologists, too, could have more impact with an assist in coding and machine learning methods. The back-and-forth discussions about creating a meaningful dataset provide insights to both sides. “We’re a unique group,” Zare says.

The goal is for such collaborations not to be unique for long. White says ecology offers legitimately difficult problems that will require big interdisciplinary teams. “A key component of these approaches is large amounts of high-quality field data to even begin to develop models in the first place. And because systems change through time,
the need for new field data is never going away,” White says. “By combining this field work with remote sensing and analysis, we can do so much more than any of us could do on our own.”

Guralnick says integrating “pixel views” of the world with observations made on the ground will require another kind of collaboration — one between humans and machines. It’s important to automate tasks humans previously have done because machines are faster at those tasks. But it’s equally important to keep humans in the loop. Human intelligence will be needed to make automation succeed.

“We have managed to develop a remarkable set of tools, especially in the last 50 years, for monitoring the environment,” Guralnick says. “Now we have the enormous challenge of integrating these data into a coherent picture that could be useful for solving problems.

“It is a frontier challenge.”
Robert Guralnick
Biodiversity Informatics Curator, Florida Museum of Natural History
rguralnick@flmnh.ufl.edu

Brian Stucky
Biodiversity Informatics Researcher, Florida Museum of Natural History
stuckyb@ufl.edu

Related Website:
Integrating Data Science with Trees and Remote Sensing
https://idtrees.org
Explore 29
30 Summer 2021
Building ethics into artificial intelligence systems
By Cindy Spence
Artificial intelligence and computer science researchers say getting machines to do things right has turned out to be relatively easy. We program Roombas to vacuum our homes, but don’t expect them to brew our coffee. We program robotic arms to sort parts in factories, but not to decide which colors to paint cars. We program doorbells to tell us who is at the door, but not to let them in. Most of our machines do one thing and do it well, usually in error-free fashion. They get the task right.

But getting machines to do the right thing — the ethical thing — now that’s a different problem. And, for now at least, it has a lot more to do with getting people to do the right thing.

Duncan Purves, an assistant professor of philosophy, specializes in emerging ethical issues for novel technologies, artificial intelligence and big data applications. Machines run on algorithms and do what algorithms tell them to do. But algorithms are mostly designed by people, and it’s challenging, Purves says, to create an algorithm that aligns with our ethical values.

“One way to think about ethics is as a set of principles or rules that determine how we ought to behave, so that ethics are about action, behavior,” Purves says. “The ability to think ethically is what distinguishes humans from animals.”

And from machines. If ethics are the guidelines that determine human actions, algorithms are the guidelines that determine the actions of machines. Algorithms already permeate our lives: who shows up on our dating apps, which job applicants make it into a hiring pool, who gets a mortgage or car loan, which route we travel from point A to point B, which advertisements we see
on social media, which books Amazon recommends, who gets into college, and where and when we deploy police.

Ethical considerations often don’t have legal consequences, but they have consequences that matter nevertheless. Purves uses the example of keeping a secret.

“You can tell me a secret, and I promise not to tell the secret, but then I do. It’s not a crime. It might damage my reputation with others, or I could lose your friendship, but these are not the only reasons I would give for keeping a secret. I would simply say, because I promised I would keep it, keeping it is the right thing to do,” Purves says. “A commitment to doing the right thing is what motivates me.”
Who Controls Data?

Just as there is no law prohibiting you from telling a friend’s secrets, there are few laws today about collecting and using the data that feed algorithms. Many of the issues with data collection and algorithms only become apparent after an application is in use, Purves says.

The data collected about us — from our cellphones, GPS trackers, shopping and browsing histories, our social media posts — add up to a bonanza for marketers and researchers. Both commercially and scientifically, the data have value. But the people generating the data often don’t control how the data are used. And the data can be used to develop algorithms that manipulate those very people. We give away our information in byzantine terms-of-service agreements that we often don’t read. Social media platforms and dating apps often use us for A/B testing, a fairly routine and benign use, but also for their own research.
Purves points to a Facebook example from 2012, when the platform manipulated the newsfeed of selected users, showing some users positive articles and showing others negative articles. The results demonstrated emotional contagion: Those who saw positive articles created more positive posts themselves. Those who saw negative articles posted more negative posts themselves. No one asked the users if they wanted to take part.

In another example, in 2015, the dating app OkCupid experimented with its matching algorithm, removing photographs in one experiment and telling users in another that they were good matches with people they otherwise would not have matched with. When the manipulation was revealed, the CEO said the experiments were fair game since the users were on the internet.

These cases were not illegal, but were they ethical? Purves says it might be worth exploring a review process for data science experiments along the lines of the review process for health-related experiments.
“Do data scientists have the same obligations to data subjects as medical researchers have with their subjects?” Purves asks.

Another issue is the privacy of our data. Privacy, Purves says, is a basic human need.

“We need a protected sphere of control over information about ourselves and our lives,” Purves says. “Controlling information about ourselves helps us to shape our relationships with other people. Part of what makes my relationship to my wife a special one is that I choose to share information about myself with her that I would not share with my colleagues. In a world where everyone had perfect information about me, this selective sharing would not be possible.”

Privacy also protects us from those who would use our information to harm us, for example, by talking about our private history with addiction or disease. “When we lose control over access to our personal data, we lose some degree of privacy,” Purves says.

Still, all these data are too tantalizing to lock away. A method known as differential privacy aims to provide as much statistical value as possible while introducing just enough mathematical “noise” that no individual’s data can be singled out. Differential privacy, for example, ensures someone can contribute her genetic information to a database for research without being identified and having the information used against her. With that guarantee for users, researchers can use the data to make new discoveries.
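The core mechanism of differential privacy can be sketched in a few lines. The example below is illustrative, not any particular deployed system: it answers a counting query (say, how many participants in a hypothetical genetic database carry a certain marker) with Laplace noise scaled to the query's sensitivity. The function names and numbers are ours.

```python
import random

def laplace_noise(scale):
    # The difference of two i.i.d. exponential draws is Laplace-distributed
    return random.expovariate(1.0 / scale) - random.expovariate(1.0 / scale)

def private_count(true_count, epsilon):
    """Answer a counting query with epsilon-differential privacy.

    A count has sensitivity 1 (adding or removing one person changes it
    by at most 1), so Laplace noise with scale 1/epsilon suffices.
    """
    return true_count + laplace_noise(1.0 / epsilon)

# How many participants in a (hypothetical) database carry a marker?
noisy = private_count(true_count=423, epsilon=0.5)
```

Each answered query leaks a little privacy, measured by epsilon, so real deployments also track a total privacy budget across queries.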
Predictive Policing

Algorithms designed to classify information into categories do their job well, Purves says. But optimizing algorithms to meet our social values is tricky. Purves and a colleague at California Polytechnic State University are exploring the intricacies of algorithms used in predictive policing through a $509,946 grant from the National Science Foundation.

On the surface, using algorithms as a crime-fighting tool makes sense. Many large departments, from Los Angeles to New York, use predictive policing to stretch resources, because algorithms can assess crime hotspots — predicting where and when crimes will occur — and replace the people who once did that work. “Police departments don’t adopt these technologies for no reason,” Purves says.

Machines also may be more accurate and potentially less biased than human officers — with a caveat. Some algorithms are based on historical arrest data, and if those data are a function of pre-existing discriminatory practices by police officers, the algorithms will reflect that bias and reinforce it. Algorithms trained on biased arrest data will recommend greater police presence in communities of color, Purves says. The presence of more officers will yield more arrests. The increase in arrests will be used as a proxy for
Unmanned military drone
higher crime rates, and the cycle becomes a feedback loop.

Purves and his colleague are trying to determine whether other kinds of less biased data can be used to train algorithms. Another avenue of investigation, he says, is the gap between identifying where crime happens — assuming it can be done accurately — and deciding what to do with that information. PredPol, a predictive policing software, can identify a 500-by-500-foot area susceptible to, for example, vehicular theft at a particular time and day. The department can respond with beefed-up patrols.

“An important question is whether increased patrols is the best police response to crime predictions,” Purves says. “There are others available.”

A better response to such predictions might be to alter the physical environment, rather than put more officers in the neighborhood. It may be possible, Purves says, to disincentivize crime by demolishing abandoned buildings and installing better street lighting, which would also avoid unintended violent confrontations between officers and citizens.

“That’s a feature of these technologies that’s been underexplored,” says Purves, who serves on several dissertation committees for computer science graduate students interested in ethics. “We’ve got the technology, it can anticipate crime and even do so
effectively, but how should we respond to those predictions?”

At the end of the three-year grant next year, Purves and his colleague hope to produce a report of best practices police departments can use in deploying AI.
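The feedback loop Purves describes can be made concrete with a toy simulation. This is an illustration of the statistical dynamic only, not PredPol or any real system; the districts, crime rates and patrol split below are invented.

```python
def simulate_feedback(rounds, initial_arrests=(60.0, 40.0), true_crime=(0.5, 0.5)):
    """Toy model of the arrest-data feedback loop.

    Both districts have identical true crime rates, but district 0 starts
    with a larger arrest record. Each round the algorithm flags the
    district with more recorded arrests as the hotspot and sends it 70%
    of patrols; new arrests scale with patrol presence, so the record
    keeps confirming the original skew regardless of actual crime.
    """
    arrests = list(initial_arrests)
    for _ in range(rounds):
        hotspot = 0 if arrests[0] >= arrests[1] else 1
        patrols = (0.7, 0.3) if hotspot == 0 else (0.3, 0.7)
        for i in (0, 1):
            arrests[i] += true_crime[i] * patrols[i] * 100
    return arrests, arrests[0] / sum(arrests)

arrests, share = simulate_feedback(rounds=10)
# District 0's share of recorded arrests climbs from 0.60 toward the
# 0.70 patrol split, even though the two districts are equally risky
```

The point of the sketch is that the model never gets the data that would correct it: arrests happen where patrols are sent, and patrols are sent where arrests were recorded.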
War Machines

Purves’ interest in the intersection of AI and ethical issues started with an email from a friend asking about ethical concerns with using autonomous weapons in warfare.

“Autonomous weapons are interesting from the perspective of ethicists because they’re essentially machines that decide to kill people,” Purves says.

Some concerns include how a machine identifies a gun on a battlefield or distinguishes between an innocent civilian and a combatant. But even in a world where the technical capabilities are perfect, there are other issues with machines that decide to take life.

“No matter how sophisticated you make a machine, it’s very unlikely that it’s going to ever develop the moral intuition or moral perception we make use of in issuing our moral judgments,” Purves says. “Without these distinctly human capacities, you could never rely on it to make sound moral judgments in warfare.”

But suppose machines could develop these human capacities. That presents still another set of problems.
“If you have reason to believe that you can rely on the moral judgment of an autonomous weapon system, then you also have reason to believe that you should care about what you do to that system,” Purves says. “Can you send it off to war to kill people without having asked its permission? You’re caught in a kind of dilemma.”

Finding commonality with the awareness or consciousness of a machine is difficult for humans. With other humans and with non-human animals, we have a kind of solution to the problem of other minds, Purves says. We share an evolutionary past and a physiological composition.

“We share enough that I can say, ‘If I have these features, and I’m having these experiences, then I have reason to believe that if you share those features you also have those experiences,’” Purves says. “We don’t have any of that shared history or shared physiology with machines.”

With weapons of war, there may be an argument that if machines make fewer mistakes than humans on the battlefield, then we should deploy them, Purves says. Give them very circumscribed tasks and limit their moral mistakes.

The dilemma of what we want our algorithms to produce — accuracy AND fairness — may require a deeper look into our society.

“Structural power imbalances in society can be the source of some of the greatest ethical challenges for AI and big data,” Purves says. “To create algorithms that align with our ethical values, sometimes we must think deeply about what those ethical values really are.

“In some sense, the dilemma is not for the data scientists,” Purves says, “but rather a dilemma for our own concepts of fairness.”

Duncan Purves
Associate Professor of Philosophy
dpurves@ufl.edu
Extracts

Waste Not, Want Not
Detector aims to reduce food waste

If you have ever brought home seemingly fresh produce from the grocery store only to find it wilted and moldering a few days later, Tie Liu feels your pain.

“Everybody has this problem: Which of these vegetables or fruits should I use first? Guess wrong, and you end up throwing out the food,” said Liu, a postharvest researcher and assistant professor in the UF/IFAS horticultural sciences department.

Consumers are not the only ones who have this problem, Liu added. Spoiled produce is also a problem for the food transportation industry and has environmental impacts as well.

University of Florida researchers are using genomics-based approaches, artificial intelligence and hyperspectral imaging to develop a device that can scan produce for freshness and optimize how and when produce is shipped.

“If you are transporting food from the farm to retailer, knowing the freshness of the produce can help you better plan ahead and maximize the freshness of the product,” Liu said. “But right now, there is no quick, easy way to know how long, for example, a head of broccoli has until it’s no longer fresh.”

As soon as something is harvested, a series of chemical changes begin inside the fruit or vegetable, and the produce starts to degrade. Scientists call this process “senescence.” That broccoli might look fine to the human eye, but inside, the clock is ticking.

That is why Liu is investigating whether it is possible to develop a hand-held or wearable device that can look beyond the visual spectrum to determine freshness. “If successful, this device will tell you how fresh produce is and help food transporters and packing houses better time shipments, increasing efficiency and providing fresher food for the consumer. Think of it like facial recognition, but instead of identifying faces,
Adrian Berry
Tie Liu conducts postharvest research in the UF/IFAS horticultural sciences department.
the device would identify freshness level,” Liu said.

Funded by a nearly $500,000, four-year grant from the USDA National Institute of Food and Agriculture’s (NIFA) Agriculture and Food Research Initiative (AFRI), the project, called FreshID, also tackles a global problem: food waste and loss.

“About 40% of food produced is wasted. A significant proportion of that waste and loss occurs between the time the produce is harvested and when it gets to the consumer. The FreshID project aims to help make sure that food gets to the consumer at the right time,” Liu said.

The project, led by Liu, includes researchers in the UF Herbert Wertheim College of Engineering. The team will combine their expertise in plant molecular biology, hyperspectral imaging, computed tomography (CT) imaging and artificial intelligence (AI).

“Ultimately, we want to understand how physiological, biochemical and molecular changes going on inside produce can be detected using X-ray CT imaging and a hyperspectral camera, which can see the whole spectrum of light, not just the blue, green and red pixels of a normal camera. We would then use that data to train a computer, using machine learning, to recognize the visual signals that indicate freshness levels,” Liu said.

“I am excited to be working with Dr. Liu and bringing our two areas of study together,” said Alina Zare, co-principal investigator on the grant. Zare is a professor in the department of electrical and computer engineering, where she leads the Machine Learning and Sensing Lab.

With the help of Zare’s lab, Liu will conduct experiments with heads of broccoli, avocados, strawberries and potatoes. However, he hopes these experiments provide insights that can be applied to many different kinds of crops.

“We are starting with fruits and vegetables, but our research aims to have broader impacts for agriculture and the reduction of food waste,” Liu said.

Samantha Murray
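The machine-learning step Liu describes can be sketched with a toy nearest-centroid classifier. This is not the FreshID pipeline; the four-band “spectra” and freshness labels below are invented to show the shape of the approach: average labeled spectra into per-class centroids, then assign a new sample to the nearest one.

```python
import math

def centroid(spectra):
    # Average a list of equal-length spectra, channel by channel
    return [sum(s[i] for s in spectra) / len(spectra) for i in range(len(spectra[0]))]

def classify(spectrum, centroids):
    # Assign the spectrum to the class whose centroid is nearest (Euclidean)
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    return min(centroids, key=lambda label: dist(spectrum, centroids[label]))

# Invented 4-band reflectance "spectra" (blue, green, red, near-infrared);
# senescence shifts these values, which the training labels capture.
training = {
    "fresh": [[0.10, 0.40, 0.20, 0.80], [0.12, 0.38, 0.22, 0.78]],
    "aging": [[0.15, 0.30, 0.30, 0.50], [0.17, 0.28, 0.33, 0.52]],
}
centroids = {label: centroid(specs) for label, specs in training.items()}

label = classify([0.11, 0.39, 0.21, 0.79], centroids)  # "fresh"
```

A real system would use hundreds of spectral bands and a trained model rather than two hand-picked classes, but the workflow is the same: labeled spectra in, freshness level out.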
AI and the Arts
UF College of the Arts adds four faculty to lead new initiative
Four scholars representing a variety of artistic and creative disciplines are joining the UF College of the Arts as part of its efforts to integrate artificial intelligence broadly into the curriculum.

“I see this cohort of four new members of our faculty, dispersed as they are in four distinct units of the college, as the core of a groundbreaking opportunity for the College of the Arts to lead in the context of UF’s focus on the development of artificial intelligence,” said Dean Onye Ozuzu.

The college seeks to explore how artistic practices — such as critique, improvisation, and a focus on process over product — can inform how artificial intelligence is cultivated and used and how its influence on society is designed.

“The arts allow us to examine our world and fashion possible futures,” said Amelia Winger-Bearskin, who joins the Digital Worlds Institute as associate professor and Banks Preeminence Chair of Digital Arts and Sciences. “The arts can also guide us as we imbue technology with our own ethics and values as we work together to realize the dreams of AI.”

Heidi Boisvert, who joins the School of Theatre and Dance as assistant professor of Immersive Performance Technologies, said she looks forward to building a curriculum that reflects artists’ responsibility to advance equity and access in the context of artificial intelligence.

“As artists and creative technologists working with AI, machine learning, natural language processing, synthetic media and more, we have a responsibility to ensure that these emerging technologies become more accessible and inclusive for diverse communities, are grounded in ethics, and are implemented for societal good, not corporate profit or systems of oppression,” Boisvert said. “I am excited to co-design a curriculum that reflects these values and
encourages students across the College of the Arts to become provocateurs and gadflies challenging the trajectory of future technology through critical discourse, artistic expression, open-source invention, and social impact.”

Fatimah Tuggar, who joins the School of Art and Art History as associate professor of Art and Global Equity, said, “I am enthusiastic about working with students on algorithmic literacy and transparency. Positively, this will lead to students creating artworks using AI that avoid the traps of applying technological answers to aesthetic questions and grounding students in developing more empathy and contextual awareness for humanity, other life forms and the environment.”
Tina Tallon joins the School of Music as assistant professor of Music Composition. She said she looks forward to being in an interdisciplinary environment that engages multiple aspects of her professional expertise.

“I didn’t fully appreciate what it meant to find one’s ‘dream job’ until UF announced this initiative,” Tallon said. “Its highly interdisciplinary nature allows me to fully embrace all the different facets of my work as an artist, engineer, historian and educator in service of society, which I’ve never found space to do anywhere else.”
Amelia Winger-Bearskin
Fatimah Tuggar
Heidi Boisvert
Tina Tallon
Brandon McKinley
Extracts

AI For ALL
Addressing AI’s impact on communities of color
Artificial Intelligence has a color problem. Various studies have demonstrated how African Americans, in particular, are negatively affected by AI-based algorithms. For example, a 2019 article in Nature reported on a study that showed an algorithm widely used in U.S. hospitals to allocate health care to patients has been systematically discriminating against Black people. The study concluded that “the algorithm was less likely to refer Black people than white people who were equally sick to programs that aim to improve care for patients with complex medical needs.” Other studies have shown AI-based discrimination in predictive policing, facial recognition, employment application screening, bank loan approvals, and many other areas.

Part of the problem may be that most AI applications are created by white engineers who lack the cultural perspectives of communities of color. As part of a report on racial equity and emerging digital technologies for the UCLA School of Law, UF telecommunications Associate Professor Jasmine McNealy wrote that “Creators embed their creations with their own values, and values reflect culture and politics. If communities are outside of the scope of the creator’s purview, they may fail to recognize the consequences of that technology for that community.”

In the report, McNealy discusses both intentional and unintentional neglect by coders, which she describes as “the creation, use, and deployment of technology without regard for the disparate impacts they may have on communities different than the imagined audience.”

McNealy, who is also associate director of the Marion B. Brechner First Amendment Project and a Consortium on Trust in Media and Technology Trust Scholar, has been exploring the impact of AI on marginalized and vulnerable communities.

Jasmine McNealy

“You can’t start from the perspective that we need to make a technology equitable, because technology reflects society,” she says. “The problem is how do we look at the system in which the technology is going to work or be active or behave and try to make that system more equitable?”

McNealy adds that technology “amplifies the problem.”

“People think, ‘Oh, technology is neutral,’ and it’s not. And then people double down on it, not wanting to stop using the technology, because we like to be efficient. We like to use tools. That’s the nature of humans. But these extensions of ourselves are just making the problem worse, and we still haven’t fixed the underlying root cause. Until that happens, technology can continue to amplify these tragic and terrible events and systems that we have in place already.”

In October 2020, McNealy received a prestigious Google Award for Inclusion Research for her project exploring community-based mechanisms to combat algorithmic bias. In December, she was named one of the “100 Brilliant Women in AI Ethics” during the Women in AI Ethics Summit.

Telecommunication Professor Sylvia Chan-Olmsted and advertising Associate Professor Huan Chen believe AI can help address inequities in the dissemination of information, which is often distributed uniformly at scale without any cultural considerations. This can lead to unfairness in information access for certain social groups because of their unique cultural backgrounds. Chan-Olmsted and Chen were recently awarded a UF AI Research Catalyst Fund grant to study “Fairness in Information Access Through Culturally Competent AI Systems.”

In their grant proposal, the scholars wrote that “Access to information is essential in today’s knowledge economy and fundamental to American democracy. However, certain groups of the population might be excluded or lack access to participate fully in public discourse/economy because their cultural background presents obstacles to access or comprehend the uniformly disseminated information without consideration of cultural relevance. Such information access inequality can result in social injustice to certain cultural groups.”

Their research will explore how sophisticated AI models might incorporate more intricate cultural dimensions by identifying means of disseminating information effectively through cultural resonance. “The project is novel in that it seeks to develop a fair AI system by using social theories to inform the training of [machine-learning] models and addresses the critical but challenging aspect of culture in a multicultural society,” according to their proposal.
Chatting Up Employees, Customers
AI-driven chatbots can improve communications
Building upon her background in corporate communication research, public relations Associate Professor Rita Men went into high gear during the pandemic to study effective communications from CEOs as well as chatbots used for social listening.

In collaboration with public relations Chair Marcia DiStaso, Men developed a model to find out how executives were communicating with their employees to maintain their trust, build relationships, engage them and help improve their feeling of well-being during the pandemic. The model examined leaders’ transparency, authenticity, empathy and optimism in communications “and how that can help reduce employees’ feelings of uncertainty and boost employee trust and engagement during the pandemic,” she said.

The team surveyed more than 1,000 employees from a variety of industries to assess executive leadership communication and how effective it was. The ability to instill trust and confidence, and to offer hope, were traits the best communicators exhibited.

It’s not much different with chatbots and other artificial intelligence technology being used on a daily basis now: The most effective ones are those that most sound and act like humans, Men said. But the question for businesses is: Is AI worth the investment?

Men has been studying the value of using social chatbots for public relations purposes, and how businesses can use them to build long-term, more personalized relationships with customers that in turn increase trust toward the company.

In a two-stage study, Men and her colleagues reviewed the companies on the Fortune 100 list and then went to Facebook to see which ones had chatbots. “We tested them and rated them on a scale of 1 to 3” to find the five that performed well in different
industries. Domino’s Pizza is doing a great job, she said.

They then asked more than 1,000 consumers to have a five-minute conversation with the chatbot of one of the five companies, randomly selected for each person. Follow-up questions included whether the conversation sounded authentic, how they would evaluate the quality of the conversation and whether the chatbot affected their relationship with the brand.

“Our hypothesis was supported,” and chatbots have value, Men said. “Social presence and the conversational human voice of the chatbot can improve their listening ability, which in turn will affect the public perception of the transparency of the company, their trust, and their satisfaction and commitment to the company.”

When chatbots sound more human, including making conversations more positive by using humor and addressing questions, they help build relationships with the user, Men said.

In the second stage, they are recruiting respondents who are customers for an experiment in which they
are controlling some conditions, such as whether conversations sound more like machines or humans. The goal will be to find out what type of messaging should be used to improve the relationship between companies and the public.

“Companies are not always using AI effectively, which leads to negative effects” like chatbots not answering or sounding like a machine, thereby hurting relationships with customers, she said. But there are ways to train chatbots to be effective communicators by working with IT people. “There are ways to make chatbots have a PR mindset. They are available 24/7 and are very responsive, so there are a lot of advantages and they can really reduce the human cost.”

An effective conversation with a chatbot can improve public perception, Men said. When businesses are “upfront with new technology, are more trendy, more competent, more agreeable, friendly, it can help a relationship and enhance a positive image in the eyes of stakeholders.”

Lenore Devore
Rita Men, associate professor of public relations, developed a model to find out how executives communicated with their employees during the pandemic.
Extracts

Fighting Antibiotic Resistance
Faster identification helps stop outbreaks
A UF research team is using a $1.2 million National Science Foundation grant to develop a set of algorithms and an electronic interface that will allow public health investigators to test and analyze biological samples for antibiotic resistance in rural areas.

Antimicrobial resistance (AMR), or the ability of an organism to stop an antibiotic from working, has become a serious threat to public health, resulting in microbial outbreaks becoming more frequent, widespread and severe. An estimated 2.8 million people per year in the United States are infected with resistant bacteria, and more than 35,000 of these infections are fatal.

According to a 2016 report from the National Academy of Medicine, antimicrobials for livestock account for 80% of the antimicrobials purchased in the U.S. Feeding low doses of antibiotics to livestock causes them to grow bigger, faster and less expensively. The
fear is that this practice leads to antibiotic resistance in the livestock that is then passed on to humans.

Christina Boucher, associate professor in the Department of Computer & Information Science & Engineering (CISE), is using the NSF funding to develop a system for real-time identification of AMR outbreaks, so doctors and public health officials can respond quickly.

High-throughput sequencing technology has been proven effective in identifying AMR, but in the past neither the technology nor the analysis was portable, Boucher says. Advancements in sequencing technology have reduced the size of the devices so they can fit into one hand, but analysis of the resulting data requires comparing millions or billions of DNA sequences. That analysis has required high-performance computers that were often far removed from rural areas.
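One standard way such analyses compare sequencing reads against known resistance genes is exact k-mer matching, sketched below. This is a simplification for illustration, not Boucher's compressed-graph framework; the gene name and sequence fragment are illustrative, and a real index would hold thousands of genes.

```python
def kmers(seq, k):
    # All overlapping length-k substrings of a DNA sequence
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}

def screen_read(read, resistance_index, k=8, threshold=0.5):
    """Flag a sequencing read whose k-mers overlap a known resistance gene.

    Returns (gene_name, fraction_of_read_kmers_matched) for the best hit
    at or above the threshold, or None if nothing matches well enough.
    """
    read_kmers = kmers(read, k)
    best = None
    for gene, gene_kmers in resistance_index.items():
        frac = len(read_kmers & gene_kmers) / max(len(read_kmers), 1)
        if frac >= threshold and (best is None or frac > best[1]):
            best = (gene, frac)
    return best

# Illustrative reference fragment, pre-indexed once into k-mer sets
reference = {"blaTEM-like": "ATGAGTATTCAACATTTCCGTGTCGCC"}
index = {gene: kmers(seq, 8) for gene, seq in reference.items()}

hit = screen_read("AGTATTCAACATTTCCGTGT", index)  # ("blaTEM-like", 1.0)
```

The challenge Boucher's team tackles is doing this kind of lookup against petabyte-scale databases, which is why the index itself must be compressed enough to query from a phone or tablet.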
Christina Boucher
Boucher’s team is developing novel algorithms and interfaces for on-site, real-time detection of AMR using smartphones and tablets, which can be used in areas remote from large data analysis centers. This project builds on previous work Boucher and her colleagues did with support from the National Institutes of Health (NIH). Boucher built a novel bioinformatics framework, developing computer algorithms that provide rapid and space-efficient means for analyzing very large data sets to determine how AMR genes evolve, grow and persist in a system that has been affected by antibiotic use.

Public health officials typically comb through huge amounts of bacterial genomic data in an effort to home in on the origin of a potentially drug-resistant outbreak. But genome sequence databases can reach a petabyte, or one quadrillion bytes of data, and analyzing that data pits researchers in a race against time before an outbreak spreads.

Among the many researchers with whom Boucher is collaborating is Mattia Prosperi, an associate professor of epidemiology in the College of Public Health and Health Professions, who helped develop machine-learning models to predict AMR features and outbreak location history from newly processed gene sequences. To produce algorithms that can be used on ever-larger sets of data, Boucher and Prosperi developed a novel means to create, compress, reconstruct and update very large graphs that display all possible outcomes. Prosperi established a process for querying the graphs of data, which helps locate the sources of pathogenic outbreaks and determine answers to other questions. This in turn facilitates the development of effective intervention methods that reduce resistant pathogens in agricultural and clinical settings.

“We are applying our methods to samples collected from both agricultural and clinical settings in Florida,” Boucher says. “Analysis of preliminary and new data will allow us to draw conclusions about the public risk associated with antimicrobial use in agriculture; the effectiveness of interventions used to reduce resistant bacteria; and the factors that allow resistant bacteria to grow, thrive and evolve.”

AI Conversations
The UFII Seminar Series on AI is wildly popular

George Michailidis

In 2020, when the University of Florida launched its artificial intelligence initiative, sparked by the gift of the fastest supercomputer in higher education, the world was in the midst of a pandemic. Starting a conversation about new AI avenues for research, and keeping it going, might seem daunting under the circumstances, but UF Informatics Institute Director George Michailidis rose to the challenge. UFII began a series of AI seminars that, a year later, have showcased the breadth of AI research and expertise already on campus.

“We already had experience with a seminar series for COVID-related work, and obviously that was virtual because we were already on lockdown, but that gave me the idea for an AI outreach activity,” says Michailidis, who became the founding director of the Informatics Institute in 2015.

The AI seminar series has been well-attended, with a Zoom gallery of 60-70 faces in some cases. But Michailidis says he was struck by the engagement with the material after the events. Each talk, roughly an hour long including a Q&A, was recorded and posted on the institute’s website.

“These presentations reached a much larger audience after we posted them, with sometimes 400 or 500 viewers just a few weeks later,” Michailidis says. “Some faculty reported back that they had used a snippet of a recording to showcase a particular point in a class or in a talk. So this is archival material that can be used in many other ways, and that has a magnifying effect.”

Michailidis moderated all the talks and says he was struck by how many — with no prompting — touched on fairness and ethics in AI, an aspect that UF intends to embed in both education and research on AI. “This became a cross-cutting theme of the talks, and this is an area the university has identified as central to its work,” Michailidis says.

The series will continue in the fall and feature the work of the 20 researchers who won awards from the AI Catalyst Fund, a $1 million pool open to all disciplines and provided by UF Research to jump-start AI research. “The teams will have made some progress in their research this fall, and it will be interesting to hear how they are engaging with AI,” Michailidis says.

Michailidis uses the word “teams” intentionally, pointing out that AI research and the catalyst awards are highly synergistic. Faculty who lack AI expertise have begun to reach out to data scientists, and data scientists, in turn, have begun to reach out to faculty who can offer social science perspectives. As UF’s 100 new AI faculty hires trek to campus in the fall, Michailidis says he expects the Informatics Institute may be a first stop. “We will help them find collaborators, especially the junior researchers, so I’m looking forward to fruitful discussions with them,” Michailidis says. “That’s part of my job, to connect people.”

Michailidis, a statistics researcher, says his own research team has used the new supercomputer, and he has seen how its speed can change the landscape of research across campus. Supercomputing, he says, was possible before, but not at the speeds of HiPerGator AI. “This powerful resource makes it possible to go faster. While the computer is working on a problem, the researcher can keep thinking and looking forward,” Michailidis says. “This computer is a transformative factor.”

To view the AI seminar series or other events sponsored by the UF Informatics Institute, go to https://informatics.research.ufl.edu.

Cindy Spence
Malachowsky Hall
Artificial Intelligence — A Smart Investment for Florida
Malachowsky Hall for Data Science & Information Technology is a 263,000-square-foot academic building being constructed in the heart of UF’s Gainesville campus. It will connect students and researchers from across disciplines and create a hub for advances in computing, communication and cyber-technologies with the potential for profound societal impact.

The building — anchored by a gift from UF alumnus Chris Malachowsky as well as funding from the state — will focus on the application of computing, communication, and cyber technologies to a broad spectrum of areas, including health care, pharmacology, security, technology development, and fundamental science. It will bring together researchers from the Herbert Wertheim College of Engineering, the College of Medicine, the College of Pharmacy and the UF Informatics Institute.

The building will also house the coordinating center for the OneFlorida Clinical Research Consortium, a statewide research network and data trust established through the UF Clinical and Translational Science Institute in 2010 to help facilitate and accelerate health research in Florida.

https://news.ufl.edu/2020/12/malachowsky-hall-/