3.2 Chang, Hung, Lee, Li, Lucena, Mardjoko, Tsarnakova, Wang, Wang, Zheng. Microbial Forensics: Identifying Bacteria and Yeast Using Ribosomal DNA Fingerprints
3.2 Microbial Forensics: Identifying Bacteria and Yeast Using Ribosomal DNA Fingerprints
By Hannah Chang ’22, along with Cassandra K. Hung, Hyunkyung K. Lee, Katherine W. Li, Alejandro J. Lucena, C. Zora Mardjoko, Rositsa Tsarnakova, Jason Wang, Lisa Wang, and Claire Zheng
Note from the Editors
This paper was granted permission to appear in the Sigma Journal, courtesy of the PA Governor’s School for the Sciences 2021 Journal.
Abstract
Microorganism identification, particularly using the 16S rRNA gene shared by bacteria, has applications in many areas of our lives. Current identification methods are subject to various limitations. Our project focused on developing a computational tool that could aid in the identification of bacteria by analyzing their 16S rRNA gene sequence via PCR and restriction enzyme digests. Biopython was used to load data from the Ribosomal Database Project (RDP) for use in our comparison program. A possibility reducer algorithm identified matching bacterial sequences between the dataset and the test set based on their ribosomal DNA restriction fragment lengths. The match results indicated that the program was functional and also revealed bacteria that had no fragment matches. The output data were manually validated to find false-positive and false-negative fragments that helped explain the results.
Introduction
Background
Microorganisms are ubiquitous, and identifying them makes it possible to analyze their function, structure, and uses systematically. Identification plays a crucial part in a variety of fields, including food contamination, food production, environmental cleanup, ecosystem biodiversity, and disease control and prevention. In recent years, Salmonella has been found as a contaminant in a number of supermarket foods, including salad, ground turkey, and other frozen poultry [1]. Being able to quickly and accurately identify disease-causing agents like Salmonella is imperative to treating such pathogens and monitoring their spread. Conversely, some microbes are critical to maintaining life. They modulate energy flow within ecosystems by acting as decomposers and are responsible for almost half the photosynthesis that occurs on Earth [6]. Microbes are also important in the production of foods such as cheeses, yogurts, and breads; for instance, modern-day yogurt production involves culturing milk with live bacteria, including Streptococcus thermophilus and Lactobacillus bulgaricus, which produce lactic acid to thicken the yogurt [7]. However, understanding these phenomena would not have been possible without first knowing the identity of the microorganism involved. Whether the goal is studying infectious diseases, producing foods, or engineering sustainable technology, being able to identify unknown microorganisms is important for a number of scientific applications.
Current methods for identification include denaturing gradient gel electrophoresis (DGGE), fluorescence in situ hybridization (FISH), clone libraries, full genome sequencing, amplified ribosomal DNA restriction analysis (ARDRA), and terminal restriction fragment length polymorphism (T-RFLP). These techniques employ a variety of procedures, and their efficacy often depends on the microorganism being analyzed. For example, DGGE relies on the melting points of DNA fragments to separate them, while clone libraries require additional phylogenetic comparison between the unknown organism and available data. Full genome sequencing is another option for identification, but it is time-consuming. One of the more widely available procedures, ARDRA, identifies organisms by creating a DNA fingerprint of their ribosomal DNA [5]. Although the ARDRA laboratory procedure is relatively quick, inexpensive, simple to use, and available in most labs, the method as a whole is time-intensive because it relies on manually matching DNA fingerprints to a database. Additionally, organism identification is often a tedious task for which simple techniques are not always available; it is therefore important to develop a straightforward approach to identifying unknown microorganisms.
16S rRNA Gene
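To make the ARDRA fingerprint idea concrete, the following is a minimal sketch of an in silico restriction digest: it cuts a sequence at each recognition site and reports the resulting fragment lengths, which is the quantity that would otherwise be read off a gel and matched by hand. It is an illustration only, written in plain Python rather than Biopython, and it is not the comparison program described in this paper. The function names `fragment_lengths` and `matches`, the `tolerance` parameter, and the toy sequence are all assumptions made for the example; only the EcoRI and HaeIII recognition sites are standard.

```python
import re

# Recognition site and cut offset (bases from the start of the site on the
# top strand) for two well-known enzymes with palindromic sites.
ENZYMES = {
    "EcoRI": ("GAATTC", 1),   # G^AATTC
    "HaeIII": ("GGCC", 2),    # GG^CC
}


def fragment_lengths(sequence, enzyme):
    """Return fragment lengths (longest first) for a complete digest of a
    linear sequence. Hypothetical helper, for illustration only."""
    site, offset = ENZYMES[enzyme]
    seq = sequence.upper()
    # A lookahead pattern finds every site, including overlapping ones.
    cuts = [m.start() + offset for m in re.finditer(f"(?={site})", seq)]
    bounds = [0] + cuts + [len(seq)]
    return sorted((b - a for a, b in zip(bounds, bounds[1:])), reverse=True)


def matches(observed, predicted, tolerance=0):
    """True if both fingerprints have the same number of fragments and each
    pair of sorted lengths differs by at most `tolerance` base pairs."""
    if len(observed) != len(predicted):
        return False
    return all(abs(o - p) <= tolerance
               for o, p in zip(sorted(observed), sorted(predicted)))


# Toy 50-base sequence (hypothetical, not a real 16S rRNA gene).
toy = "ATGC" * 3 + "GAATTC" + "ATGC" * 5 + "GGCC" + "AT" * 4
print(fragment_lengths(toy, "EcoRI"))   # [37, 13]
print(fragment_lengths(toy, "HaeIII"))  # [40, 10]
```

A nonzero `tolerance` is one simple way to allow for the limited resolution of fragment sizes estimated from a gel; an appropriate value would depend on the experimental setup.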
A 70S prokaryotic ribosome is composed of a large 50S subunit and a small 30S subunit; the 30S subunit in turn consists of 21 proteins and a 16S rRNA molecule [7]. The associated 16S rRNA gene is of particular interest in studying bacterial taxonomy because it is highly conserved both structurally and functionally across prokaryotic organisms; however, there is still enough variation within 16S rRNA gene sequences to serve as markers for differentiating between species [3]. Furthermore, the 16S rRNA gene is approximately 1,500 base pairs long, which is large enough for use in computational informatics research [2]. The entire prokaryotic