The Manhattan Scientist Series B Volume 5 Fall 2018
A journal dedicated to the work of science students
The Manhattan Scientist Series B
Volume 5
Fall 2018
ISSN 2380-0372
Student Editors Lilliana McHale Sarah Reese
Faculty Advisor and Editor in Chief
Constantine E. Theodosiou, Ph.D.
Manhattan College
4513 Manhattan College Parkway, Riverdale, NY 10471
Series B
The Manhattan Scientist
Volume 5 (2018)
A note from the dean
It is hard to believe we are already in the fifth volume of the Manhattan Scientist’s Series B! The present volume includes thirty-five papers, covering most of the disciplinary subjects of the School of Science at Manhattan College, including interdisciplinary ones. These student-centered research activities continue to “teach through discovery,” and to be aligned with the College’s mission, to “provide a contemporary, person-centered educational experience that prepares graduates for lives of personal development, professional success, civic engagement, and service to their fellow human beings.” The quality of the projects continues to be exceptional, considering that these students were in high school only two to three years ago. I would like to express our continuing gratitude to the several faculty members who willingly provided critical mentoring to our students and future colleagues, with minimal or no compensation for these efforts. This work continued to receive critical financial support for our students from a variety of sources (in no particular order): the School of Science Research Scholars Program, the Jasper Scholars Program of the Office of the Provost, the Catherine and Robert Fenton endowed chair in biology, the Linda and Dennis Fenton ’73 endowed biology research fund, the Michael J. ’58 and Aimee Rusinko Kakos endowed chair in science, Jim Boyle ’61, Kenneth G. Mann ’63, and a National Science Foundation research grant. On the cover of this issue is an image from the paper by Claudia Ramirez. We hope that the elegance we experience in nature is also reflected in the work presented by our students. The last article in this volume, written by a student majoring in English with a minor in biology, deviates from scientific research per se and provides important guidance to future authors on how to properly present their research in a scientific manuscript.
I would like to express my deep appreciation to the students for their efforts and their persistence on the road of scientific discovery. As in the previous volumes, the task of typesetting so many articles in LaTeX has been arduous, but well worth the effort to recognize the researchers’ work. We are all honored to showcase our students’ and colleagues’ continuing contributions to the body of scientific knowledge. It is with great pleasure that we present the publication of Series B, Volume 5, of The Manhattan Scientist.
Constantine Theodosiou
Dean of Science and Professor of Physics
Table of Contents

Biochemistry
Investigating Bacillus subtilis biofilm formation stimulated by plants
  Alexis Brown . . . . . . . . . . 1
Building a library of mutant H2AZ histones to observe variances in H2A vs. H2AZ interactomes
  Shereen Chaudry . . . . . . . . . . 13
Establishing a chromatin immunoprecipitation protocol (ChIP) to use with unnatural amino acid cross-linking
  Ellen Farrelly . . . . . . . . . . 21
Characterization of the role of bacterial tyrosine kinases in Bacillus subtilis biofilms
  Tameryn Huffman . . . . . . . . . . 27
Investigating the role of the Med protein in biofilm formation
  Juan Lara-Garcia . . . . . . . . . . 37
Purification and isolation of glucose oxidase for use in a bio-battery
  Monique Ng . . . . . . . . . . 47
Biology
Xylem conductivity of primary, secondary, and tertiary veins of plant leaves
  Maya Carvalho-Evans . . . . . . . . . . 55
Geographic variation in body size and sexual size dimorphism of North American ratsnakes
  Alexander J. Constantine . . . . . . . . . . 65
Bark formation for Neobuxbaumia tetetzo and Pachycereus hollianus
  Phillip Dombrovskiy . . . . . . . . . . 73
Predicting mortality rates for saguaro cacti (Carnegiea gigantea)
  Cole Johnson . . . . . . . . . . 83
Predicting rates of bark coverage on saguaro cacti (Carnegiea gigantea)
  George Kennedy . . . . . . . . . . 95
Scaling relationships between leaf areas and leaf veins for percurrent leaves
  Gina Leoncavallo . . . . . . . . . . 103
Machine learning algorithms to determine bark coverage on saguaro cacti (Carnegiea gigantea)
  Marissa LoCastro . . . . . . . . . . 111
Determination of the temporal and spatial variation of Giardia lamblia in ribbed mussels from Bronx, NY
  Alexa Marcazzo . . . . . . . . . . 125
Bark and spine analyses of Neobuxbaumia mezcalaensis and Pachycereus hollianus
  Catherine McDonough . . . . . . . . . . 133
Predicting bark coverage on saguaro cacti (Carnegiea gigantea)
  Olivia Printy . . . . . . . . . . 143
Growth dynamics of Artemisia tridentata
  Claudia Ramirez . . . . . . . . . . 153
Xylem conductivity in stems of Artemisia tridentata
  Victoria Webb . . . . . . . . . . 161
Chemistry
Theoretical and experimental design of efficient polycyclic aromatic hydrocarbon adsorbents
  Jeovanna Badson . . . . . . . . . . 169
Carcinogenic nature of polyaromatic hydrocarbon binding to DNA
  Jacqueline DeLorenzo . . . . . . . . . . 177
Removing polycyclic aromatic hydrocarbons by adsorption onto silica gels treated with lipophilic carboxylic acids
  Jessi Dolores . . . . . . . . . . 185
An insoluble chemical reducing agent: Application to Cr(VI) removal
  Nicholas Dushaj . . . . . . . . . . 193
Core-shell nanoparticles as photocatalysts to purify water
  Hannah Mabey . . . . . . . . . . 199
Enhancing enzymatic fuel cells with nanotechnology
  Seth Serrano . . . . . . . . . . 205
Solar cells using nanowire technology
  Francisca Villar . . . . . . . . . . 213
Creating lead-free perovskite materials for a cleaner, greener future
  Amanda Zimnoch . . . . . . . . . . 219
Computer Science
Parallel GPU based simulations of multilayer neural networks with multi-valued neurons
  James S. Abreu Mieses . . . . . . . . . . 225
Speed up of big data encryption on GPU using CUDA
  Zi Xin Chiu . . . . . . . . . . 231
Implementation and evaluation of LPN-based authentications
  Eric Ciccotelli . . . . . . . . . . 237
Intelligent edge detection using a MLMVN
  Josh Persaud . . . . . . . . . . 245
Environmental Science
Reconstruction of atmospheric CO and the stable isotopes δ13C and δ18O over the last 8,000 years
  Sophia Misiakiewicz . . . . . . . . . . 253
Analyzing the concentration of atmospheric CO derived from biomass burning during 1700-1800 AD
  Peter Parlato . . . . . . . . . . 265
Physics
Observable relics of a simple harmonic universe
  Peter Gilmartin . . . . . . . . . . 275
Comparison of Monte Carlo generators for Higgs decay processes
  Sarah Reese . . . . . . . . . . 283
Science writing style
Writing science research manuscripts for publication
  Lilliana McHale . . . . . . . . . . 293
On the cover: Image of a reproductive branch with seeds of Artemisia tridentata enlarged 10×. (From Ramirez, p. 156)
Investigating Bacillus subtilis biofilm formation stimulated by plants
Alexis Brown∗
Department of Chemistry and Biochemistry, Manhattan College
Abstract. Bacillus subtilis is a soil-dwelling bacterium that forms biofilms on the roots of plants. Previous research shows that when B. subtilis forms a biofilm on a plant root, it promotes plant growth and protects the plant from pathogens. It is unclear how the plant signals to B. subtilis to colonize and form a biofilm, but root extract and exudate have been shown to stimulate B. subtilis biofilm formation. The goal of this study was to characterize the chemical signals present in root extract that stimulate B. subtilis biofilm formation. Further, we sought to explore whether these chemicals are found in a variety of plant roots or are specific to tomato plants. Scallion, spinach, potato, and ginger root samples were obtained by purchasing plants or, for tomatoes, growing plants from seeds. The concentrated samples were passed through a Sep-Pak C18 column, resulting in the collection of flow through and wash eluates. These fractionated samples were then used in luciferase assays, which tested the activation of the B. subtilis biofilm pathway via the transcription of the genes sdpA and tasA following the activation of the master regulator, Spo0A. Biofilm pellicle assays were also performed as a form of qualitative analysis. Root extracts were added to B. subtilis strain 3610 in lysogeny broth, a medium which does not support biofilms, and the resulting biofilm was compared to the thick, wrinkly pellicle of B. subtilis in a biofilm-supporting medium, LBGM. Our results show that all the plant samples had some level of stimulation, which was seen in the luciferase assays. The particular fraction with the greatest stimulation remains unclear, as it varied between the different plants. The intensity of this stimulation also varied, which was revealed in the biofilm assays, with ginger showing the most developed biofilms.
Introduction
Humans obtain many essential nutrients from plants in the form of fruits and vegetables. Gardens and farms grow millions of plants, representing thousands of species throughout the world. Unfortunately, humans are not the only species in search of the nutrients that plants can provide. We are in competition with other animals, fungi, and bacteria, including pests that may cause disease in the plants. As a result, scientists in a variety of fields are working to develop methods to protect plants from pathogens in order to grow more viable plants to feed our growing global population. Today, many farms use pesticides to ward off pests that pose a risk to human crops. However, many of these pesticides contain chemicals that can be poisonous to humans as well as to the pests they are meant to prevent. Thus, scientists are constantly searching for safer alternatives for plant pest control. One potential solution involves the use of soil bacteria, like Bacillus subtilis, that can be stimulated to congregate into colonies on the roots of plants (Chen et al., 2013). These bacterial communities, or biofilms, have been shown to induce a systemic response in the plant that then protects it from pathogens. This makes B. subtilis a natural plant protectant that can be marketed as a biocontrol agent for agricultural use. Ultimately, research that leads to a better understanding of the mechanism by which B. subtilis forms biofilms on plant roots might have ∗
Research mentored by Sarah Wacker, Ph.D.
major implications, including more controlled biofilm formation on plants, improved protection of plants against pathogens, and advancements in agriculture. Despite the extensive research on the genetics of B. subtilis, it is still uncertain how exactly these bacterial communities are formed (Cairns et al., 2014). It is believed that sporulation of B. subtilis begins with recognition that nutrients are limited. This leads to the activation of histidine kinases which then initiate a phosphorelay, resulting in phosphorylation and activation of the protein Spo0A. Spo0A serves as the primary transcription factor involved in the initiation of both sporulation and biofilm formation (Fujita and Losick, 2005). Exactly how the histidine kinases are first activated, however, remains unclear. Previous research showed that biofilm formation can be triggered by tomato plant root extract and root exudate, and that this activation is through one of the histidine kinases which then activates Spo0A (Chen et al., 2012). Other studies have identified polysaccharides as playing a major role in biofilm formation. According to Beauregard et al. (2013), certain polysaccharides, found in the plant cell wall, are important for biofilm formation. It is of particular interest to the field to characterize the interaction between B. subtilis and plants. These studies strive to provide a deeper understanding of how B. subtilis biofilms are formed by investigating the chemical signaling and metabolic processes involved.
Purpose
The goal of this research is to determine if biofilm formation is triggered by a general metabolic state or by a signal from a small molecule specific to this role. In doing this, we also hope to identify which small molecules aid in biofilm formation and whether these molecules vary between different plant species. In order to investigate this question, I will test extract from a variety of plants to determine if all of them can stimulate biofilm formation or whether this stimulatory activity is specific to some plants.
Materials and Methods
Collection of root samples
Tomato plant preparation was based on a published procedure (Chen et al., 2012). Briefly, forty tomato plants were grown from surface-sterilized seeds in a greenhouse. Plants were grown in soil and watered three times daily. After nine weeks, tomato plants were harvested by removing all soil from the roots of the plants. Plant roots were soaked in sterile water for 72 hours to collect tomato exudate before the roots were removed from plants and pulverized with a mortar and pestle. Root extract samples were filter sterilized with several filters that ranged from a 1 µm filter to a 0.22 µm filter. All root samples were stored at 4 °C. Scallion, spinach, potato, and ginger samples were all purchased from grocery stores. For scallion, the small roots at the base of the plant were washed and removed from the rest of the plant. A small amount of sterile water was added to the small scallion roots before they were crushed with a mortar and pestle. The resulting scallion extract was later concentrated using a rotary evaporator. Spinach roots were washed, cut into pieces, and macerated with a mortar and
pestle. One whole potato was used for the potato sample. The potato skin was first shaved with a scalpel before being processed through a juicer. The skin of the ginger was removed in a similar manner. Two pieces of ginger were used on separate days to create two unique samples. Each piece was cut into smaller pieces and then macerated with a mortar and pestle. All root samples were passed through increasingly small filters, finishing with a 0.22 µm filter, after which the samples were judged to be sterile. All root samples were stored at 4 °C.
Column chromatography
As shown in Fig. 1, each extract sample was fractionated into polar and nonpolar eluates using a C18 Sep-Pak column (Waters). The polar fractions, the flow through and wash, were eluted first from the column. Various percentages of methanol were then run through the column to collect the nonpolar fractions.
Figure 1. Illustration of the fractionation process using the C18 column. Polar fractions included the flow through and wash. Nonpolar fractions included 10-100% methanol.
Luciferase assay
B. subtilis 3610 containing the genetic reporters PsdpA-luciferase (called CY136) or PtasA-luciferase (called ALM91) was cultured from 1-day-old colonies and grown in LB at 37 °C for ∼3 hrs (to mid-log-phase) and then diluted 1:100 with fresh LB media. 200 µL of diluted cells were added to each well of a 96-well plate (white with a clear bottom). Root samples were added at the indicated concentrations (1% = 2 µL or 2% = 4 µL). Luminescence and optical density readings were taken every 10 minutes for 14 hours in a FilterMax F5 multi-mode plate reader (Molecular Devices) at 37 °C, with shaking in between readings to support cell growth. To analyze data, luminescence readings were normalized against cell absorbance at 600 nm for each time point. Data shown are representative results from at least two different experiments. An example curve with pyruvate is shown in Fig. 2.
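The normalization step described above can be sketched in a few lines of code. This is an illustrative sketch only, not the authors' analysis code; the field names and numbers below are invented for the example.

```python
# Illustrative sketch of the normalization step: each luminescence reading
# is divided by the OD600 absorbance taken at the same time point.
def normalize_readings(rows):
    """rows: dicts with 'time_min', 'luminescence', and 'od600' keys.
    Returns (time, luminescence/OD600) pairs, skipping near-zero OD600
    readings early in growth to avoid dividing by noise."""
    normalized = []
    for r in rows:
        od = float(r["od600"])
        if od > 0.01:
            normalized.append((r["time_min"], float(r["luminescence"]) / od))
    return normalized

# Three readings taken 10 minutes apart (invented values)
readings = [
    {"time_min": 0,  "luminescence": 50,   "od600": 0.005},  # skipped: OD too low
    {"time_min": 10, "luminescence": 400,  "od600": 0.25},
    {"time_min": 20, "luminescence": 1200, "od600": 0.5},
]
print(normalize_readings(readings))  # [(10, 1600.0), (20, 2400.0)]
```

Dividing by OD600 corrects for differences in cell density, so curves from wells with different growth rates can be compared directly.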
Figure 2. Example of a graph of expression of PsdpA-luciferase reporter with the addition of pyruvate, a known biofilm stimulant. •, B. subtilis; •, pyruvate. The normalized luminescence of pyruvate is greater than that of B. subtilis, indicating biofilm stimulation.
Pellicle biofilm assay
Biofilm assays were also used as a method to obtain qualitative data to complement the quantitative data from the graphs of the luciferase reporter assays. This was done by inducing pellicle formation with the various root extract samples and comparing the resulting pellicle that formed at the air-liquid interface. B. subtilis 3610 was cultured from 1-day-old colonies and grown in lysogeny broth (LB) at 37 °C for 3 hrs (to mid-log-phase) and then diluted 1:1000 in 8 mL of media that does not support biofilm (LB). Media was supplemented with 2% root extract (40 µL to 8 mL) or, as a positive control, a combination of 1% glycerol and 0.1 mM MnSO4 (Fig. 3). Biofilm cultures were incubated at 30 °C for 48 hours before images of the cultures were taken. Biofilms were then judged based on thickness, opacity of the film, and the presence or lack of wrinkles.
Figure 3. B. subtilis pellicle biofilm assay showing the experimental controls. Left, Negative control: Lysogeny broth (LB) + 3610 strain of B. subtilis. Right, Positive control: Lysogeny broth + 1% glycerol and 0.1 mM manganese sulfate (called LBGM) + 3610 strain of B. subtilis.
Results
Tomato
B. subtilis biofilms have been studied in tomato plants in recent studies (Chen et al., 2012; Chen et al., 2013). Thus, this study featured tomato plants in hopes of also seeing stimulation of biofilm formation with tomato extract. Our first assay for biofilm stimulation relies on the promoter of a gene, sdpA, which is involved in toxin production in B. subtilis. Importantly, sdpA is also under the transcriptional regulation of Spo0A, and thus if biofilm formation is activated through the Spo0A pathway, we expect to see increased expression of the sdpA gene. Fig. 4 shows the effect of adding 2% tomato root extract to an sdpA luciferase reporter assay (strain CY136). After about 500 minutes, the fractionated tomato extracts rise to a peak maximum while B. subtilis alone begins to decline. The extracts never rose above the highest peak of the B. subtilis control, however. Unfortunately, tomato extract on the CY136 strain did not show consistent results when this experiment with the sdpA reporter was repeated in later weeks.
Figure 4. PsdpA-luciferase reporter assay with tomato root extract input, flow through, and wash.
Figure 5. PtasA-luciferase reporter assay with tomato root extract input, flow through, and wash.
Legend for Figs. 4 and 5: B. subtilis; B. subtilis + 2% tomato extract input; B. subtilis + 2% tomato extract FT; B. subtilis + 2% tomato extract wash.
We also examined the effect of tomato root extract on a second luciferase reporter, based on the promoter of the gene tasA, which is a component of biofilm formation and also under the transcriptional control of Spo0A. Our results with the tasA-luciferase reporter (strain ALM91) deviated from what was seen with the sdpA reporter assay. The tasA luciferase reporter assay repeatedly showed an increase in stimulation upon the addition of the tomato extract. This is evidenced in Fig. 5, in which the highest peak, between 300 and 400 minutes, is from the extract input. Further, the maximum point of the extract, at around 350 minutes, exceeds the maximum of the reporter alone. The root extract was fractionated with a C18 column and the subsequent flow through and wash were assayed as well. The results of both reporter assays revealed that the input and flow through stimulated B. subtilis more than the wash (Figs. 4, 5, 6).
Finally, pellicle biofilm assays were performed on each of the fractions using the wild type 3610 strain of B. subtilis. Unlike the reporter assays, this is a direct assay of whether the plant samples could stimulate biofilm formation. The fractionated extract did in fact yield biofilms on the surface of both LB (Fig. 6) and Msn media (data not shown). Compared to the positive control that used the highly supportive media, LBGM, the biofilms that developed from the other media were not nearly as thick or wrinkly.
Figure 6. Pellicle biofilm assay on wild type B. subtilis 3610 with tomato root extract input, flow through, and wash in LB media. Panels, left to right: LB; LB + Input; LB + Flow Through; LB + Wash.
Scallion
The scallion plant was a promising plant species to investigate because of its bulbous roots and accessibility. After repeated trials, our results did not seem to resemble anything that we had seen in the assays with the tomato extract. For example, there was a consistent trend in which the extract sample developed a delayed second peak. This behavior was illustrated in both the CY136 and ALM91 strains of B. subtilis (Figs. 7 and 8, respectively).
Figure 7. PsdpA-luciferase reporter assay with scallion root extract.
Figure 8. PtasA-luciferase assay with scallion root extract input and wash.
Legend: B. subtilis; B. subtilis + 2% scallion extract input; B. subtilis + 2% scallion extract FT.
The results changed after the fractionation of the scallion sample. In both the sdpA (Fig. 7) and tasA reporter assays (Fig. 8), the scallion extract wash developed a delayed peak after about 500 minutes. In the sdpA assay this delayed peak is almost as high as that of the B. subtilis control.
These luciferase assays were followed by a biofilm assay. The scallion extract samples did not yield as strong a biofilm as the control. However, between the different scallion fractions, it was evident that the wash fraction was wrinklier than the input and flow through samples (Fig. 9).
Figure 9. Pellicle biofilm assay on wild type B. subtilis 3610 with scallion root extract input, flow through, and wash in LB media. Panels, left to right: LB + Input; LB + FT; LB + Wash.
Spinach
Spinach root extract was also investigated and assayed. The sdpA luciferase reporter assay showed results similar to those seen with the scallion extract. In the sdpA assay, a delayed peak developed after about 500 minutes. Although this peak rose above the trend of the control group at that time interval, the B. subtilis control had a much greater peak earlier in the assay (Fig. 10). In contrast, the tasA results showed no activation at all. This is evident in Fig. 11, in which the sample with spinach extract showed a lower normalized luminescence throughout the entire assay. Fractionation of the spinach extract did not result in any notable difference. The complementary biofilm assay for spinach extract did not produce a strong biofilm. There was clearly a film at the surface, but it was not very thick and had no wrinkles (Fig. 12).
Figure 10. PsdpA-luciferase assay with spinach root extract input, flow through, and wash.
Figure 11. PtasA-luciferase assay with spinach root extract input, flow through, and wash.
Legend for Figs. 10 and 11: B. subtilis; B. subtilis + 2% spinach extract input; B. subtilis + 2% spinach extract FT; B. subtilis + 2% spinach extract wash.
Figure 12. Pellicle biofilm assay on wild type B. subtilis 3610 with spinach root extract input in LB media.
Potato
Although potato tubers are not actually roots, their thick starchy bodies grow underground and could potentially be a surface for biofilms of the soil-dwelling B. subtilis. The potato tubers were blended into an extract and assayed. In the sdpA luciferase reporter assay, the samples with potato extract show strong stimulation after 300 minutes, as evidenced by the peaks in Fig. 13. However, a parallel trial did not yield a strong distinction between the potato extract and the control. It is also interesting to note that the wash fraction yielded the greatest peaks in both trials. The tasA luciferase reporter assay did not show the same results; there was no major distinction between the samples with and without potato extract. The same results came of the tasA assay involving the fractionated potato samples (Fig. 14). The biofilms that derived from the potato extract samples were quite thick but did not develop any wrinkles (Fig. 15).
Figure 13. PsdpA-luciferase assay with potato root extract input, flow through, and wash. Axes: luminescence/absorbance vs. time (min)
Figure 14. PtasA-luciferase assay with potato root extract input, flow through, and wash. Axes: luminescence/absorbance vs. time (min)
B. subtilis; B. subtilis + 2% potato extract input; B. subtilis + 2% potato extract FT; B. subtilis + 2% potato extract wash.
Ginger
Similar to the potato tubers, ginger is not actually a root but rather a rhizome that can grow above or just beneath the surface of the soil. Thus, it was possible to explore its ability to stimulate B. subtilis biofilms. Both luciferase assays showed significant stimulation of B. subtilis after the
Figure 15. Pellicle biofilm assay on wild type B. subtilis 3610 with potato root extract input in LB media.
addition of the ginger extract. There seemed to be a common characteristic of a delayed peak that came close to or even exceeded the maximum of the control group. In the sdpA luciferase assay, this delayed peak did not occur until after about 500 minutes (Fig. 16). The tasA luciferase assays yielded similar results (Fig. 17). Earlier trials showed that the samples with ginger extract were not very distinct from the plots of the control, while later trials revealed sharp yet delayed peaks similar to those of the sdpA assays.
Figure 16. PsdpA-luciferase assay with ginger root extract input, flow through, and wash. Axes: luminescence/absorbance vs. time (min)
Figure 17. PtasA-luciferase assay with ginger root extract input, flow through, and wash. Axes: luminescence/absorbance vs. time (min)
B. subtilis; B. subtilis + 2% ginger extract input; B. subtilis + 2% ginger extract FT; B. subtilis + 2% ginger extract wash.
Luciferase assays were repeated after the extract was fractionated, and the results of these trials were inconsistent. Although all the fractions stimulated B. subtilis and showed a peak after 500 minutes, the magnitude of the luminescence varied. In some trials the input and flow through developed into the greatest peak, and in others it was the wash fraction (data not shown). A biofilm assay for the ginger sample was also performed and showed results that complemented those of the luciferase assays. The biofilms that developed were thick layers of deep wrinkles (Fig. 18). These biofilms were comparable to those of LBGM, the positive control (Fig. 3). Upon comparison of the biofilms of the different fractions, it is evident that the strongest biofilms derived from the ginger extract input, rather than the flow through and wash (Fig. 18).
Figure 18. Pellicle biofilm assay on wild type B. subtilis 3610 with ginger root extract input in LB media.
Discussion
Our results have answered many of our original questions. First, because there was some level of stimulation, particularly in the luciferase assays, in each of the plant samples, we concluded that there are likely common chemicals within all plants that provide some sort of signal for B. subtilis to form biofilms. However, the intensity of this stimulation varied greatly between plants. While potato and ginger yielded strong peaks that were distinct from our control and formed thick, wrinkly pellicles in the biofilm assay, the results from spinach and scallion were not as significant. Future research would attempt to identify what the particular chemicals are and whether the varying concentrations in each of the plants may have contributed to the varied levels of stimulation. Because our experimental procedure was modeled on the work of Chen et al. (2012), we expected our results to be similar to theirs, i.e., thick biofilms with profound wrinkles. Our tomato results did not align with the published data as well as expected, despite using several different tomato samples. We used both old and new samples of tomato root extract and exudate. The old samples performed relatively well in the first few luciferase assays but, as time went on, it seemed that the ability of the extract to stimulate biofilms decreased. Again, this was evident in both the luciferase and biofilm assays. Unfortunately, the results of the tomato exudate and extract that were developed during the nine-week period did not show high levels of stimulation either. We noted that the roots took on an odd smell after soaking in deionized water for three days during the exudate collection process. It is possible that the tomato samples were contaminated by some foreign elements in the lab or in the water that was used. This is a possible contributing factor for our inconclusive results. One feature of our results that did coincide with the previous research by Chen et al.
(2012) was that tomato exudate consistently showed lower levels of stimulation than the extract. This supports the conclusion that the molecules needed to recruit B. subtilis to form biofilms are more concentrated in the roots than in the rhizosphere. Overall, it seemed that the better results came from the input and flow through fractions of the tomato extract, rather than the wash. This was evident in both the CY136 and ALM91 reporter assays and the biofilm assays. From this, we can hypothesize that the chemicals involved in biofilm
stimulation with tomato plants are most active in their most concentrated forms. Among the other plants, the fractions with the highest stimulation varied. Like tomato, the greatest stimulation in ginger came from the input and flow through. In the scallion assays, it seemed that the wash stimulated B. subtilis biofilms the best. We were most impressed with the results of the ginger samples. This was the only plant sample that showed consistent biofilm stimulation in both the luciferase assays and the biofilm assays. The ginger samples that were used were not roots but rather rhizomes, which are underground stems that can be found just under the soil. This location makes them accessible to B. subtilis, a soil-dwelling bacterium. Thus, we hypothesize that the ginger rhizome contains a high concentration of molecules that can stimulate B. subtilis biofilms. It also leads us to question whether other plant parts, not just root systems, have this stimulatory effect. A few questions arose from the methods that were used in this study as well. First, the PsdpA-luciferase reporter assay seemed to agree with the results of the biofilm pellicle assay better than the PtasA-luciferase reporter. Perhaps this means that expression of sdpA is more indicative of a group of bacteria's likelihood of forming a biofilm than the tasA operon is, even though sdpA is not directly in the biofilm pathway. Another question is which areas of the graphs of normalized luminescence are most indicative of B. subtilis biofilm stimulation. In our studies, we assumed any increase in luminescence over the control indicated an increase in stimulation, regardless of whether this increase occurred early or late in the assay. Future research to define whether strong biofilm stimulation is typically illustrated as increases in luminescence early, late, or for extended periods of time in luciferase assays would be useful.
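The normalization behind these plots (reporter luminescence divided by culture absorbance at each time point) can be sketched as follows. This is only an illustration of the bookkeeping; the variable names and example readings are hypothetical, not data from this study.

```python
# Sketch of luciferase reporter normalization, assuming paired plate-reader
# readings of luminescence and absorbance (OD) at each time point.
# All numeric values below are illustrative placeholders, not measured data.

def normalize(luminescence, absorbance):
    """Return luminescence normalized by culture density at each time point."""
    return [lum / od for lum, od in zip(luminescence, absorbance)]

# Hypothetical readings for a control and an extract-treated sample.
times = [0, 200, 400, 600]            # minutes
control_lum = [100.0, 4000.0, 2000.0, 1500.0]
control_od  = [0.05, 0.40, 0.80, 0.90]
sample_lum  = [100.0, 1500.0, 3000.0, 9000.0]
sample_od   = [0.05, 0.40, 0.80, 0.90]

control_norm = normalize(control_lum, control_od)
sample_norm = normalize(sample_lum, sample_od)

# A "delayed peak" like those described for ginger shows up as the sample
# exceeding the control only at later time points.
delayed = [t for t, s, c in zip(times, sample_norm, control_norm) if s > c]
```

Under this toy data, the treated sample only overtakes the control at the 400- and 600-minute readings, mimicking the late peaks seen in the ginger assays.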
Acknowledgment
This work was supported by the Manhattan College Jasper Scholars Program.
References
Beauregard, P. B., Chai, Y., Vlamakis, H., Losick, R., and Kolter, R. (2013). Bacillus subtilis biofilm induction by plant polysaccharides. Proceedings of the National Academy of Sciences of the United States of America, 110(17), E1621–E1630. http://doi.org/10.1073/pnas.1218984110
Cairns, L. S., Hobley, L., and Stanley-Wall, N. R. (2014). Biofilm formation by Bacillus subtilis: new insights into regulatory strategies and assembly mechanisms. Molecular Microbiology, 93(4), 587–598. http://doi.org/10.1111/mmi.12697
Chen, Y., Cao, S., Chai, Y., Clardy, J., Kolter, R., Guo, J. H., and Losick, R. (2012). A Bacillus subtilis sensor kinase involved in triggering biofilm formation on the roots of tomato plants. Molecular Microbiology, 85(3), 418–430. http://doi.org/10.1111/j.1365-2958.2012.08109.x
Chen, Y., Yan, F., Chai, Y., Liu, H., Kolter, R., Losick, R., and Guo, J. (2013). Biocontrol of tomato wilt disease by Bacillus subtilis isolates from natural environments depends on conserved genes mediating biofilm formation. Environmental Microbiology, 15(3), 848–864. http://doi.org/10.1111/j.1462-2920.2012.02860.x
Fujita, M., and Losick, R. (2005). Evidence that entry into sporulation in Bacillus subtilis is governed by a gradual increase in the level and activity of the master regulator Spo0A. Genes & Development, 19(18), 2236–2244. http://doi.org/10.1101/gad.1335705
Building a library of mutant H2AZ histones to observe variances in H2A vs. H2AZ interactomes
Shereen Chaudhry∗
Department of Chemistry and Biochemistry, Manhattan College
Abstract. Histones are part of the most basic pattern of DNA compaction and are therefore essential components of gene expression regulation during transcription. The histone variant H2AZ is implicated in the destabilization of the nucleosome structure, particularly at active promoter sites. In this study, we aimed to create a library of HTZ1 (the gene for H2AZ) mutant plasmids containing the genetic code expansion (TAG) codon for incorporation of the unnatural amino acid pBPA. Using this library, we crosslinked H2AZ and analyzed it in comparison to canonical H2A by western blot. H2AZ was also quantified relative to H2A using densitometry. We successfully created 11 mutated H2AZ histone expression vectors and obtained a preliminary crosslinking pattern showing subtle differences between H2A and H2AZ links. Future work includes mutation of at least 90 more sites and the development of a complete interactome.
Introduction
Each of our cells contains approximately two meters of DNA, compacted into a nucleus roughly six micrometers in diameter. This occurs in levels of organization, the most basic of which is the repeating nucleosomal unit. This unit consists of a histone octamer made up of two copies each of four histones: H4, H3, H2A, and H2B. H2A is one of the core histones that stabilizes the nucleosomal structure. H2AZ is a variant of this protein and can replace the canonical version in some nucleosomes; when acetylated, it is implicated in the destabilization of the nucleosome at active promoter sites. This has been connected to oncogenic activation in prostate cancer. Like canonical H2A, H2AZ has been remarkably conserved throughout evolution and is present in most eukaryotic cells. H2AZ has been shown to facilitate DNA accessibility during transcription, which increases gene expression [1]. In this study, we aim to illuminate the dynamic differences between H2A and H2AZ by studying how each alters the nucleosomal architecture. We achieve this by using a crosslinking unnatural amino acid to establish and compare the in vivo histone-protein interactome at the surface of these variant nucleosomes.
Materials and Methods
PCR amplification of HTZ1
PCR reactions were set up for the amplification of the yeast HTZ1 gene (encoding H2AZ) plus flanking regions of ±450 bp of genomic DNA. Primers were designed with 5’ restriction sites for SacI and XhoI for cloning into the pRS426 expression vector. Temperature cycles were set at 98◦C for 15 s (denaturation), 55◦C for 30 s (annealing), and 72◦C for 60 s (extension). An initial ∗
Research mentored by Bryan Wilkins, Ph.D.
denaturation period was extended to 1 minute, while the final extension time was extended to 2 minutes. DNA product was purified using the GeneJet PCR Purification Kit (Thermo, K0701), following the included instructions. 40 µL of water was used in place of elution buffer.
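For bookkeeping, the cycling parameters above can be written down as a small script that estimates the total thermocycler run time. The number of cycles is not stated in the text, so the 30 cycles below is a placeholder assumption.

```python
# Sketch of the PCR program described above. The per-step times come from the
# text; N_CYCLES is an assumption (not given in the paper).
N_CYCLES = 30                    # placeholder; not stated in the protocol

initial_denaturation = 60        # s (extended to 1 minute)
cycle_steps = {
    "denaturation": 15,          # s at 98 C
    "annealing": 30,             # s at 55 C
    "extension": 60,             # s at 72 C
}
final_extension = 120            # s (extended to 2 minutes)

total_s = (initial_denaturation
           + N_CYCLES * sum(cycle_steps.values())
           + final_extension)
total_min = total_s / 60  # 55.5 min under the 30-cycle assumption
```

Under the assumed 30 cycles, the block runs roughly 55 minutes, excluding ramp times between temperatures.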
Figure 1. Crystal structure of nucleosome with positions of interest [PDB: 1ID3]
Agarose gel electrophoresis PCR products and digested DNA were analyzed via electrophoresis on a 1% agarose gel in 1x TAE buffer. The gel was run at 100 V on constant voltage for 60 minutes. Imaging was done using auto exposure and UV 302 setting, on a c600 Azure bioimager.
Double digest and ligation reaction of insert and vector
pRS426 plasmid was digested with XhoI and a two-fold excess of SacI restriction enzymes in Tango buffer. The PCR amplification product was also digested with the same enzymes. Each reaction was incubated at 37◦C for 1 hr. DNA products were separated on a 0.8% agarose gel at a constant 75 V for 1.5 hr. Products were isolated from the gel and purified using the GeneJet gel extraction kit (Thermo, K0691). Enzyme-treated fragments were ligated in a 1:3 molar ratio (vector:insert) using 75 µg vector. Ligations were transformed into competent cells via standard heat shock procedures.
Preparation of calcium competent cells
Competent cells were prepared to make cells that would retain our modified amino acid plasmid. DH10B cells were streaked onto an LB plate from frozen glycerol stock and grown overnight at 37◦C. 1 L LB, 1 L of 100 mM MgCl2, 1 L of 100 mM CaCl2, 100 mL of 85 mM CaCl2 with 15% glycerol, 4 centrifuge bottles, and microcentrifuge tubes were autoclaved to sterilize, then chilled overnight. Cell colonies were transferred to liquid LB media, in 5 test tubes with 5 mL each. These were grown in a shaker at 37◦C overnight. 0.5 L of LB was inoculated with 10 mL of starter culture and grown in the shaker. OD600 was measured hourly until it reached 0.35-0.40. Cells were placed on ice and chilled for 20-30 minutes. Centrifuge bottles were placed on ice and the 0.5 L culture was split into two parts of 250 mL. Cultures were pelleted by spinning in a centrifuge for 15 minutes at 4000 rpm at 4◦C. Supernatant was decanted and each pellet was resuspended in 50 mL of ice cold MgCl2. Cells were harvested by centrifugation again, decanted, then resuspended in 100 mL of cold CaCl2. This was chilled on ice for 20 minutes. Cells were harvested by centrifugation at 3000 rpm for 15 minutes at 4◦C. A 50 mL conical tube was rinsed with ddH2O and chilled on ice.
Supernatant was decanted and the pellet was resuspended in 25 mL of ice cold 85 mM CaCl2 with 15% glycerol. The solution was transferred to the conical tube. The pellet was collected by centrifugation, resuspended in 2 mL of 85 mM CaCl2, then aliquoted into sterile 1.5 mL microcentrifuge tubes. Cells were stored in the -80◦C cryofreezer.
Transformation into competent E. coli cells
5 µL of DNA ligation product was added to 50 µL of competent cells. The solution was kept on ice for 30 minutes, then heat shocked for 1 min 15 s at 42◦C. The solution was returned to ice for 5 minutes, and 400 µL LB was added to each and gently mixed. Cells were incubated at 37◦C for one hour and then plated on ampicillin agar plates. Individual clones were grown and DNA was isolated using the GeneJet plasmid purification kit (Thermo, K0502). The instruction manual was followed, except that 50 µL H2O was used for elution rather than elution buffer. Isolated DNA was sent for sequencing in order to verify the constructs.
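The 1:3 (vector:insert) molar ratio used in the ligation above translates into masses via fragment lengths, since mass scales with length for double-stranded DNA. In this sketch the lengths are illustrative assumptions, not values taken from the text: pRS426 is roughly 5.7 kb, and the HTZ1 ORF plus its ±450 bp flanks is taken as roughly 1.3 kb.

```python
# Sketch of the vector:insert molar-ratio-to-mass conversion for a ligation.
# Fragment lengths below are assumptions for illustration only.
VECTOR_BP = 5726   # assumed length of pRS426, bp
INSERT_BP = 1300   # assumed length of HTZ1 plus flanking regions, bp

def insert_mass(vector_mass, vector_bp, insert_bp, molar_ratio):
    """Mass of insert giving `molar_ratio` moles of insert per mole of vector.

    Mass is proportional to length for dsDNA, so:
        m_insert = m_vector * (insert_bp / vector_bp) * molar_ratio
    """
    return vector_mass * (insert_bp / vector_bp) * molar_ratio

# With the 1:3 vector:insert ratio from the text and the assumed lengths,
# the required insert mass is about 2/3 of the vector mass (same units in/out).
mass = insert_mass(75.0, VECTOR_BP, INSERT_BP, 3)
```

The calculation is unit-agnostic: whatever unit the vector mass is given in, the insert mass comes out in the same unit.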
Quick-change addition of HA tag and unnatural amino acid (repeated for each site)
Following the correct construction of the H2AZ expression vector, quick-change inverse PCR reactions were set up using site-specific forward and reverse primers to introduce TAG mutations at specified codons, or the C-terminal HA coding sequence. Following PCR, the reaction was treated with DpnI. DNA was then transformed into competent cells and plated on ampicillin plates. DNA was purified from individual clones and sequenced to ensure the resulting plasmid was accurate.
Transformation into yeast cells
50 OD of yeast cells were grown. Cells were washed, pelleted, and resuspended in competent cell buffer (1× TAE and 100 mM LiAc in H2O). A PEG solution was prepared by diluting 1× TAE and 100 mM LiAc in 50% PEG solution. Transformation was carried out by mixing 1 µg plasmid, 100 µg carrier DNA, 100 µL of cells in competent cell buffer, and 700 µL of PEG solution. The solution was mixed and incubated at 30◦C for 30 minutes, then heat shocked for 15 minutes at 42◦C. Cells were pelleted at full speed for 1 minute. The pellet was washed with sterile water and resuspended in 100 µL of water. Cells were plated on minus uracil/leucine (-Ura/-Leu) media plates.
Quantification of H2AZ vs. H2A production in vivo
H2AZ and H2A genes, tagged with HA, were transformed into cells and grown on -Ura plates. 12 OD600 of each were taken and pelleted at full speed for 10 minutes. Pellets were resuspended and boiled in SDS-PAGE buffer to lyse the cells and denature the proteins. Product was analyzed via western blot using antibodies against the HA tag on each protein. The difference in concentration was analyzed by densitometry using ImageJ software.
Crosslinking by UV light exposure of H2A and H2AZ cells
pBPA mutant histones were expressed in 5 mL -Ura/-Leu media with glucose (2% final concentration). pBPA unnatural amino acid was added to a final concentration of 1 mM.
24 OD of H2A and 96 OD of H2AZ were taken, as per the quantification. Cells were pelleted and resuspended in water. Cells were placed onto a metal plate within the crosslinking apparatus and exposed to UV light (365 nm) for 20 minutes, on ice, at a distance of approximately 10 cm. Whole cell lysates were then analyzed by western blot, alongside a negative control that had not been exposed to UV light.
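The quick-change step described above replaces one sense codon with the amber stop codon (TAG) that directs pBPA incorporation at residue positions such as Y65 or A61. A minimal sketch of that edit at the sequence level, using a made-up toy sequence rather than the real HTZ1 gene:

```python
# Sketch: replace the codon at a given residue position (1-based, as in the
# Y65/A61 numbering used in the paper) with the amber codon TAG.
# The example sequence is a toy stand-in, not the actual HTZ1 sequence.
AMBER = "TAG"

def introduce_amber(coding_seq, residue):
    """Return coding_seq with the codon for `residue` replaced by TAG."""
    if len(coding_seq) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    start = (residue - 1) * 3
    if start + 3 > len(coding_seq):
        raise ValueError("residue position outside the coding sequence")
    return coding_seq[:start] + AMBER + coding_seq[start + 3:]

# Toy example: mutate "residue 2" of a 3-codon sequence.
mutant = introduce_amber("ATGTACGGT", 2)
```

In practice the mutation is introduced by the inverse-PCR primers rather than by direct editing, but the resulting plasmid sequence differs from the wild type by exactly this one-codon substitution.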
Results and Discussion
We aimed to make the mutants listed in Table 1 as a start to our H2AZ pBPA mutant library. Each of the mutant HTZ1 genes has a C-terminal coding region for an HA tag. The sequence of these plasmids was verified via sequencing. The cloning steps used to build each plasmid in the final library are outlined in Fig. 2. Correct ligation reactions were verified by enzymatic digestion to identify the gene-of-interest dropout. The digests were verified using gel electrophoresis (Fig. 3). Lanes 2-4 and 6-7 were identified as having the correct sized insert. We
Table 1. Plasmids prepared with pBPA mutation

Gene     pBPA Position    Tag
HTZ-1    Y65              HA
HTZ-1    Y58              HA
HTZ-1    A61              HA
HTZ-1    G9               HA
HTZ-1    S24              HA
HTZ-1    A26              HA
HTZ-1    G17              HA
HTZ-1    A4               HA
HTZ-1    G9               HA
HTZ-1    G12              HA
used clone number 2 as our wild type H2AZ vector, and all TAG mutations were made using this plasmid as the template.
Figure 2. Overview of stepwise process used to prepare plasmids.
Prior to expression of the pBPA (TAG) mutants, we first investigated the expression level of H2AZ as compared to canonical H2A. This was necessary in order to account for the disparity in concentrations between H2A and H2AZ. It has been reported that H2A has an approximately ten-fold higher expression level than the H2AZ variant. We expressed HA-tagged H2A and HA-tagged H2AZ from a vector, in yeast, and then compared the amount of protein isolated from each using western blotting against the HA epitope. We normalized to cell density and quantified protein levels via densitometry. Our results indicate that we express H2A at approximately 3 times the level of the H2AZ variant, in vivo (Fig. 4 and Table 2).

Figure 3. Gel electrophoresis verification of the digested clones
Figure 4. Western blot H2A vs. H2AZ quantification image
Table 2. Quantification of H2A vs. H2AZ concentrations in vivo

Histone    Areas (1, 2, 3)                      Average      Ratio
H2A        15617.518, 18109.953, 15184.175      16303.882    2.96
H2AZ       6498.598, 6642.548, 3401.820         5514.322     1
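The averages and H2A:H2AZ ratio in Table 2 follow from simple arithmetic on the three replicate densitometry areas. A sketch of that calculation, with the values copied from the table:

```python
# Densitometry areas from Table 2 (three replicate band measurements each).
h2a_areas = [15617.518, 18109.953, 15184.175]
h2az_areas = [6498.598, 6642.548, 3401.820]

def mean(values):
    return sum(values) / len(values)

h2a_avg = mean(h2a_areas)    # ~16303.882, as reported
h2az_avg = mean(h2az_areas)  # ~5514.322, as reported

# Expression ratio of canonical H2A to the H2AZ variant.
ratio = h2a_avg / h2az_avg   # ~2.96
```

The four-fold difference in cells collected for lysis (96 vs. 24 OD600, described in the text) compensates for this roughly three-fold expression gap.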
We controlled for these differences in concentration between the canonical and variant histone versions by collecting different cell amounts for our lysis reactions. We used 24 OD600 of H2A and 96 OD600 of H2AZ cells to account for this in vivo disparity. Our initial crosslinking experiments studied positions Y58 and A61, due to their solvent exposure at the acidic patch of H2A [2]. These sites are well-documented binding residues for proteins that interact at the nucleosome. We hypothesized that these positions might bind differential proteins. The results shown in Fig. 5 are preliminary crosslinking results, in which we observe only slight variations in the density of the crosslinking bands when comparing the two species. In the time allowed for this project we were only able to analyze crosslinks on a 4-12% SDS-PAGE gel. This allows for optimal separation and visualization of proteins between approximately 15-80 kDa. We observe very little difference in the density of the crosslinks between the two positions when compared across each histone. The results are not a complete surprise, because a majority of chromatin binding proteins are larger and would be visualized in a range outside of that resolved here. These results also need to be reanalyzed and run in biological triplicate. We expect that we will be able to better resolve differential binding patterns when we run the crosslinks on lower percentage gels (3-8%). We will also analyze all crosslinking sites from our library to better understand how moving the probe across the histone's surface might influence crosslinking signals.
Figure 5. Initial crosslinking results for H2AZ vs. H2A
Here, we describe the successful creation of a small H2AZ mutant library of expression plasmids that allows for the expression of pBPA-histones. These mutants can be crosslinked using UV light to study protein-protein interactions. Future work includes crosslinking each of our mutant proteins and analyzing their crosslinked products on a gel with a lower percentage of polyacrylamide in order to see more subtle differences in their interaction patterns [2]. Additionally, we aim to complete this library with over 100 mutation positions, in order to obtain a complete interactome. Finally, we plan to compare the crosslinked interactome to that of H2A in order to investigate the differences in interactions between these two key histones.
Acknowledgments
This work was supported by an NIH R15 grant to Dr. Bryan Wilkins.
References
[1] E. Sarcinella, P. C. Zuzarte, P. N. I. Lau, R. Draker, and P. Cheung. Monoubiquitylation of H2AZ distinguishes its association with euchromatin. Mol. Cell Biol. 27(18), 6457–6468 (2007)
[2] B. J. Wilkins, N. A. Rall, Y. Ostwal, T. Kruitwagen, K. Hiragami-Hamada, M. Winkler, Y. Barral, W. Fischle, and H. Neumann. A cascade of histone modifications induces chromatin condensation in mitosis. Science 343, 77–80 (2014)
Establishing a chromatin immunoprecipitation (ChIP) protocol to use with unnatural amino acid crosslinking
Ellen Elizabeth Farrelly∗
Department of Chemical Engineering, Manhattan College
Abstract. Chromatin immunoprecipitation (ChIP) is a biochemical technique that can be used to investigate the interactions between specific proteins and DNA within the cell. Each time the technique is used for isolation of nuclear proteins, the procedure needs to be optimized for the specific experiment. The goal of this work was to optimize the mechanical cell lysis and DNA shearing steps of the ChIP process in yeast. Once optimized, we aim to perform ChIP analysis of cells that express mutant histones harboring unnatural amino acids, to determine the nucleosomal occupancy of these histones throughout the chromosome.
Introduction
Chromatin is an essential component of all eukaryotic cells because it serves as the basic control center for all cellular activities. Human DNA is an extremely long macromolecule: the DNA in one cell, stretched out, would be approximately two meters long, taller than the average human. However, it is extensively packaged so that it can be contained within a cell nucleus that is approximately six micrometers in diameter. The packaging of DNA occurs primarily through the interaction of histone proteins. Eight histone subunits interact to form an octamer, which interacts with DNA directly, coiling the DNA on a number of levels to result in the extensive compaction of DNA into chromatin. Protein interactions with DNA occur for a number of reasons in addition to DNA packaging. Protein-DNA interactions are also important for regulating a number of processes, including transcription of genes. Chromatin immunoprecipitation (ChIP) is a biochemical technique that allows us to understand the interactions between specific proteins and DNA within the cell (Dedon et al., 1991). Proteins that bind DNA are chemically crosslinked, in living nuclei, forming DNA-protein complexes along the genome that are covalently locked in spatiotemporal relationships (Hoffman et al., 2015). Chromatin is then isolated and the entirety of the DNA is sheared into smaller fragments (∼200−1000 base pairs in size). This creates a pool of chromatin-associated proteins bound to their respective DNA targets. Using antibodies specific for the protein of interest, immunoprecipitation methods are used to enrich the target. The purified DNA-protein complex is then reverse crosslinked and the DNA is isolated. These DNA fragments can be sequenced and aligned to the organism’s genome, revealing the specific loci of protein binding along the chromatin fiber.
This technique provides data that informs upon a protein’s gene regulatory role, for instance, how it might mediate chromatin function at distinct promoters, silencers, enhancers or insulators. ∗
Research mentored by Bryan Wilkins, Ph.D.
ChIP is a commonly used technique and a variety of ChIP procedures have been previously established. However, each time the technique is used for isolation of nuclear proteins, the procedure needs to be optimized for the specific experiment. The goal of this work was to optimize the mechanical cell lysis and DNA shearing steps of the ChIP process in yeast. Once optimized, we aim to perform ChIP analysis of cells that express mutant histones harboring unnatural amino acids, to determine the nucleosomal occupancy of these histones throughout the chromosome. Here we describe the expression of epitope-tagged (HA, human influenza hemagglutinin, YPYDVPDYA) histone proteins in yeast from a plasmid vector. Cells expressing the HA-tagged histones were crosslinked and then subjected to lysis and chromatin purification. Our results indicate that we have achieved efficient mechanical lysis and that optimal sonic shearing of chromatin occurs between 6-9 min. Isolated DNA-protein complexes from the appropriately sheared DNA fragments are now ready to be further processed and subjected to antibody enrichment.
Materials and Methods
Histone expression and cell growth
The S. cerevisiae histone H2A (HTA1) or H3 (HHT1) gene, including 450 bp up/downstream of the ORF, was cloned from yeast genomic DNA as a SacI/XhoI PCR fragment into pRS426. An HA epitope coding sequence was introduced just prior to the stop codon using a standard QuikChange protocol. Histone expression vectors, under control of their native promoters, were transformed into yeast cells using standard heat shock, lithium acetate techniques. In all experiments, yeast cells were cultured in minus uracil standard synthetic complete (SC) dropout medium (1.7 g/L Difco yeast nitrogen base without amino acids, 5 g/L ammonium sulfate, 2% glucose, and 2 g/L amino acid dropout mixture). Cells were grown at 30◦C with shaking at 215 rpm. Cultures (100 mL total) were initiated at OD600 ∼0.2 from an overnight culture and the cells were allowed to grow to an optical density of OD600 ∼1.0, which we approximated as 0.5−3.0 × 10^7 cells/mL.
Crosslinking and mechanical lysis
Formaldehyde was added directly to the 100 mL cell culture (1% final concentration) and allowed to incubate at room temperature (RT). Initially this was done for 1 hour, but additional times were investigated, including 45 and 15 minutes. During these time periods the solution was mixed occasionally by hand agitation. After the specified incubation times, glycine was added to a final concentration of 200 mM and allowed to quench the solution for 5 min at RT. The cells were then collected by centrifugation and the supernatant was discarded. The cells were then resuspended in 10 mL of Tris-buffered saline (TBS) and allowed to wash for 20 min, at RT, with occasional mixing by hand agitation. The cells were collected and then resuspended in 1 mL of TBS, transferred to a microcentrifuge tube, mixed gently, and then collected again.
Cells were then washed with 1 mL FA-lysis buffer (50 mM Hepes-KOH, pH 7.5, 140 mM NaCl, 1 mM EDTA, 1% Triton X-100, 0.1% sodium deoxycholate, 1 mM PMSF), at 4◦C. The pellet was then resuspended in 500 µL FA-lysis buffer.
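The crosslinking and quench concentrations above imply simple dilution arithmetic at the bench. In this sketch the 37% (w/v) formaldehyde stock and the glycine molecular weight are standard values assumed for illustration; they are not stated in the text.

```python
# Sketch: reagent amounts for the crosslinking step described above.
# Assumptions (not from the text): formaldehyde stock is 37% (w/v);
# glycine MW = 75.07 g/mol.
CULTURE_ML = 100.0
STOCK_FORMALDEHYDE_PCT = 37.0   # assumed stock concentration
TARGET_FORMALDEHYDE_PCT = 1.0   # final concentration, per the protocol
GLYCINE_MW = 75.07              # g/mol
GLYCINE_MM = 200.0              # mM final, per the protocol

# C1*V1 = C2*V2, ignoring the small added volume (typical bench approximation)
formaldehyde_ml = CULTURE_ML * TARGET_FORMALDEHYDE_PCT / STOCK_FORMALDEHYDE_PCT

# Grams of glycine for a 200 mM final concentration in the 100 mL culture
glycine_g = GLYCINE_MM / 1000 * CULTURE_ML / 1000 * GLYCINE_MW
```

Under these assumptions, roughly 2.7 mL of stock formaldehyde and about 1.5 g of glycine are needed per 100 mL culture.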
The solution of cells was placed on ice and acid-washed glass beads were added (approximately half the volume of the solution). The mixture was vortexed for short periods of time (15 or 30 seconds), and then placed on ice for 1 min. This was repeated for variable total vortexing times of 30, 45, and 90 s. Alternatively, one set of cells was treated with 30 s of vortexing and 1 min on ice, repeated for a total of 5 minutes. Lysates were collected by piercing the bottom of the tube with a small heated needle (18 ga) and then placing the tube within a new 2 mL Eppendorf tube. Both tubes were centrifuged for 1 min at 5000 rpm to allow separation of the solution from the glass beads. The cell lysate was then diluted with FA-lysis buffer to a final volume of 1.5 mL.
Sonication
The cell lysates were then sonicated (Qsonica model #Q55, microprobe size 5/64”) at 25% output for 15 s, and then cooled on ice for 1 min. This process was repeated for a total sonication time ranging from 3 to 12 minutes. Following each sonication time tested, the sheared chromatin solutions were clarified by centrifugation at 15,000 rpm. The chromatin solution (supernatant) was transferred to a new tube.
Agarose gel electrophoresis
DNA separation and fragment size determinations were performed using agarose gels prepared in standard 0.5× TBE buffer (50 mM Tris, 50 mM boric acid, 10 mM EDTA). Agarose percentage varied and is indicated in the respective figures. All gels were stained with ethidium bromide and imaged on an Azure c600 imaging system.
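The pulsed sonication schedule above (15 s bursts separated by 1 min on ice) can be tallied to see how many bursts a given total sonication time requires, and how long the procedure takes at the bench. A small sketch of that bookkeeping:

```python
# Sketch of the pulsed sonication schedule: 15 s bursts, each followed by
# a 1 min rest on ice, repeated until the target total "on" time is reached.
BURST_S = 15
REST_S = 60

def schedule(total_sonication_min):
    """Return (number_of_bursts, elapsed_bench_minutes) for a target on-time."""
    bursts = (total_sonication_min * 60) // BURST_S
    elapsed_s = bursts * BURST_S + bursts * REST_S  # rest after every burst
    return bursts, elapsed_s / 60

# e.g. the middle of the tested 3-12 min range:
bursts, elapsed = schedule(6)
```

A 6 min total sonication time thus requires 24 bursts and roughly half an hour of bench time, which is why only a handful of total times in the 3-12 min range were practical to test.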
Results
We first tested formaldehyde crosslinking times as a way to determine if there was a significant effect on the ability to recognize our immunoprecipitation epitope of interest. Several protocols that we drew from suggested varying crosslinking times, ranging from 10-120 min. We decided to test 15 and 45 min crosslinking times as our “short” and “long” exposure times. Following crosslinking, we collected approximately 12 ODs of cells and analyzed whole cell lysates by western blotting against the histone HA-tag. Fig. 1 details a western blot image of our protein isolates. These results indicate that we successfully expressed our protein in these cells and that the presence of histone proteins can be identified under different lengths of crosslinking treatment with formaldehyde. There were significant, and equal, signal densities for histone proteins treated under both conditions. We concluded that the “long” crosslinking time, if needed during optimization, would be acceptable because the antigen binding site was not occluded during crosslinking. Lanes 1-3 were also treated with shearing at different times and outputs (see discussion below). We see no effect of shearing on the ability to efficiently isolate the protein. However, because we are interested in a protein known for its high affinity for DNA, we decided to perform all future crosslinking at the shorter, 15 min time.
24
The Manhattan Scientist, Series B, Volume 5 (2018)
Farrelly
Figure 1. Western blot of target H2A protein isolated from whole cell lysates, using anti-HA antibodies, treated with formaldehyde for 15 minutes versus 45 minutes. Lane 1: sheared at 25% output for 15 min, Lane 2: sheared at 75% output for 15 min, Lane 3: sheared at 50% output for 15 min, Lane 4: sheared at 25% output for 45 min.
We then tested variable shearing techniques to optimize sonication of chromatin isolated from crosslinked cells. We tested several output powers (amplitudes) against shearing time. Optimal shearing of chromatin results in a smear of DNA fragments ranging between 200-800 base pairs (bp). As shown in Fig. 2, the most concentrated smear of DNA was in the wells loaded with samples sonicated at 25% amplitude. However, at the top of the figure there is also DNA large enough to remain stuck in the wells after electrophoresis. This led us to prepare new samples, bead beat them for different times, sonicate them at 25% amplitude, and then treat them with DNase and RNase to check whether the smear was actually DNA.
Figure 2. Comparison of DNA fragmentation, separated by electrophoresis on a 1.2% agarose gel.
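As a rule of thumb for reading gels like the one in Fig. 2, migration distance in agarose is roughly linear in the logarithm of fragment size, so a band's size can be estimated by log-linear interpolation against the ladder. A minimal sketch (the ladder distances and sizes below are hypothetical, not measurements from this gel):

```python
import math

def estimate_size(distance, ladder):
    """Estimate fragment size (bp) from migration distance by log-linear
    interpolation. ladder: list of (distance, size_bp) pairs sorted by
    increasing distance (larger fragments migrate less)."""
    xs = [d for d, _ in ladder]
    ys = [math.log10(s) for _, s in ladder]
    # find the two ladder bands bracketing the query distance
    for (x0, y0), (x1, y1) in zip(zip(xs, ys), zip(xs[1:], ys[1:])):
        if x0 <= distance <= x1:
            t = (distance - x0) / (x1 - x0)
            return 10 ** (y0 + t * (y1 - y0))  # interpolate in log space
    raise ValueError("distance outside ladder range")
```

For example, a band halfway between hypothetical 1000 bp and 100 bp ladder bands would be estimated at roughly 316 bp, reflecting the logarithmic spacing.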
We next compared the amounts of time necessary to bead beat the cells (mechanical lysis), using pre-sonicated DNA, at 45 seconds, 90 seconds, and 5 minutes of total bead beating. Originally these results looked promising, given the smear of DNA and its concentration in the 100-300 bp range, but when the samples were treated with DNase and RNase and a DNA gel was run, we realized that what we thought was DNA in the gel was actually RNA, explaining the small size (Fig. 3). These results were surprising and unexpected. It seems unlikely that a smear of this size and concentration would be RNA only. The concentrated sample that appears below the smallest marker (100 bp) is certainly expected to be RNA, but not the rest of the sample. This led us to believe there was an error in treatment and potentially in our isolation technique. We decided to explore an alternative lysis procedure that uses enzymatic rather than mechanical lysis.
Figure 3. Pre-sonicated DNA lysed at varying bead beating times. Each sample was treated with RNase and DNase as indicated. Sample was separated by electrophoresis on a 1.2% agarose gel.
After the conclusion of the summer research by E.F., the project was continued by Dr. Wilkins, who tried an enzymatic cell lysis approach to break open the cells and expose the DNA in solution. The samples were sonicated as indicated in Fig. 4. The concentrated signals are most intense, and at the lowest migration position, for the 9 minute sample. It appears, from this first attempt, that sonicating for a total of 9 minutes at 25% output was optimal.
Figure 4. Using enzymatic lysis, shearing times compared at 25% amplitude. We now observed a clear change in the size of the sheared DNA, in 3 min intervals, over 9 min.
Discussion
Future Research
The next step in this research is to perform an immunoprecipitation of the HA-tagged histone and optimize those sequential steps of the process. Once the IP protocol is established for our specific protein, we will have developed a full protocol for quick and efficient ChIP processing of the tagged histones we are targeting. The next goal is then to express an HA-tagged histone that has been site-specifically modified with a photocrosslinking unnatural amino acid. The unnatural amino acid, p-benzoylphenylalanine (pBPA), can be UV-activated to induce protein-protein crosslinks with interacting partners. It is important to note here that crosslinking from pBPA is not the same as crosslinking with formaldehyde during the ChIP process. pBPA crosslinks protein to protein, whereas we are interested only in DNA-protein crosslinking with formaldehyde. These two crosslinking techniques are separate chemical events that will work
sequentially to achieve our aim. In living cells we intend to first crosslink histone to protein with UV-activation of pBPA and then perform ChIP analysis. We believe this will be advantageous for illuminating the chromosomal occupancy of proteins that bind to the nucleosome but do not bind DNA directly (or have a more transient binding profile). There are many proteins that act at the nucleosome but do not necessarily associate with DNA. If a target protein of interest has been genetically modified to contain a tag (similar to tagging with HA, but making sure it is a secondary tag), then it can be purified by IP methods using the tag’s respective antibody. Since histone proteins have a tight association with DNA, we are confident in our ability to form DNA-protein complexes during ChIP. However, a protein that has less affinity for DNA might not readily form those complexes. We propose that we can first crosslink from an HA-tagged pBPA-histone to a tagged protein of interest and then follow the ChIP protocol to crosslink the histone to DNA. This will create a bridge of covalent interactions that allows us to associate the more transient protein with a specific locus of DNA. The histone-DNA binding is enriched by IP with the HA-tag, and then a second IP is performed against the secondary tag on the pBPA-crosslinked protein. In this way, we will enrich the DNA fragments that are associated with the secondary binding protein (Fig. 5).
Figure 5. Diagram of a 2-fold ChIP process isolating DNA attached to proteins cross-linked to other protein targets with an unnatural amino acid.
Acknowledgements
The author would like to thank Dr. Bryan Wilkins for the opportunity to work with him. This work was supported in part by an NIH R15 grant to Dr. Wilkins.
References
Dedon, P. C., Soults, J. A., Allis, C. D., and Gorovsky, M. A. (1991) A simplified formaldehyde fixation and immunoprecipitation technique on DNA in vivo. Anal. Biochem. 197, 83-90.
Hoffman, E. A., Frey, B. L., Smith, L. M., and Auble, D. T. (2015) Formaldehyde crosslinking: a tool for the study of chromatin complexes. J. Biol. Chem. 290(44), 26404-26411. doi: 10.1074/jbc.R115.651679.
Characterization of the role of bacterial tyrosine kinases in Bacillus subtilis biofilms
Tameryn Huffman∗
Department of Chemistry and Biochemistry, Manhattan College
Abstract. Bacillus subtilis is often used as a model organism for the many different developmental pathways it can undergo. One of these pathways – biofilm formation – is studied intently by microbiologists for its importance in fields ranging from agriculture to medicine. In its biofilms, B. subtilis secretes a matrix of exopolysaccharides (EPS) and amyloid-like protein fibers to adhere to a surface and protect itself from pathogens. The production of EPS has been shown to be regulated by the proteins EpsA and EpsB, which make up a tyrosine kinase that senses EPS and, through phosphorylation, activates downstream components to produce more EPS. The transcriptional regulator SinR controls the production of biofilm matrix, including the levels of the Eps proteins. The role of a second tyrosine kinase pair, PtkA and TkmA, in biofilm formation is still being elucidated. Here we create deletion mutations of all tyrosine kinase components in the presence and absence of sinR. By comparing the biofilm phenotypes of these strains on two separate biofilm media, LBGM and MSgg, we are able to clarify the role of the tyrosine kinases in biofilm formation. We suggest a model in which TkmA and PtkA regulate biofilm formation through crosstalk with EpsA and EpsB. We see that the deletion of SinR, which is known to negatively regulate expression of the epsA-O operon, causes increased biofilm formation in most of the tyrosine kinase single and double mutants, but not in the EpsA, EpsB, TkmA, and PtkA quadruple mutant. We hypothesize that this effect is due to the role of SinR in regulating expression of the EpsAB proteins. Statistical analyses were used to determine significant differences between the biofilm phenotypes of comparable strains, and purifications of TkmA and EpsB were attempted.
Importance.
Research on biofilm formation by Bacillus subtilis, a biological control agent, has important applications in agriculture: understanding how these bacteria protect plants from disease is crucial to preventing crop infection and death worldwide. B. subtilis is also an effective model organism for infectious agents in humans, such as Streptococcus pneumoniae. Treatments that prevent biofilm formation by pathogens inside human hosts could be useful weapons against infectious diseases and functional safeguards against their virulence. There is also potential to change the way we prevent bacterial infections, aiming to render pathogens harmless instead of directly eliminating them.
Introduction
Bacteria and other microorganisms commonly attach to surfaces and create complex multicellular communities, called biofilms. When these biofilms occur on indwelling medical devices, they result in particularly difficult-to-treat infections, as the bacteria in a biofilm secrete components of a matrix that can protect them from harmful agents. Therefore, research into the mechanisms through which bacteria form biofilms is useful, as it can provide insight into how to prevent biofilms in medical settings. Bacillus subtilis is a non-pathogenic bacterium that has been used as a model organism due to the many developmental pathways it can take, including biofilm formation. B. subtilis creates a beneficial biofilm on the roots of its plant hosts, resulting in protection of the plant from some pathogens. In these biofilms, cells are encapsulated in amyloid-like protein
∗Research mentored by Sarah Wacker, Ph.D.
fibers made by the bacteria. Exopolysaccharides are a vital component of the biofilm matrix in many bacteria, including B. subtilis and Escherichia coli; they are also important for other bacteria, such as Streptococcus pneumoniae, where they make up the capsule, a critical virulence factor. In all of these cases, EPS production must be regulated depending on the external environment. A bacterium interacts with its environment by appropriately altering its gene expression in response to environmental signals and conditions; biofilm formation is thus a colony-wide response to these stimuli. It is believed that biofilm regulation is at least somewhat directed by cell population density [1]. When cell population density is low, not enough signaling molecules are produced to trigger a biofilm response. When cell population density is high, however, enough autoinducers are produced to activate the production of EPS and other biofilm genes. EPS, in turn, acts as an autoinducer: the presence of extracellular EPS signals B. subtilis to produce more EPS in a positive feedback loop that drives cells towards biofilm formation. It is known that bacterial tyrosine kinases, two-component systems consisting of a membrane-embedded receptor and a cytoplasmic kinase, are often involved in the regulation of exopolysaccharide production through phosphorylation of downstream biosynthetic enzymes [2]. In B. subtilis, EPS production is controlled by a 15-gene operon known as epsA-O that is only expressed during biofilm formation and is repressed by SinR in motile cells (Fig. 1). Dr. Wacker and colleagues at the Harvard University Department of Molecular and Cellular Biology have recently shown that the first two genes in this operon encode a tyrosine kinase, EpsAB, that recognizes and responds to EPS [1].
Figure 1. Transcriptional regulation of epsA-O operon by SinR and known functions of its proteins.
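The positive feedback loop described above can be illustrated with a toy discrete-time model in which EPS stimulates its own production through a saturating term while being lost at a constant per-step rate. All rate constants here are illustrative, not measured values for B. subtilis:

```python
def eps_feedback(steps=200, k_base=0.1, k_fb=0.5, K=1.0, decay=0.2, eps0=0.01):
    """Toy model of EPS autoinduction: production rises with EPS level
    (saturating, Michaelis-Menten-style) while EPS decays at a fixed rate."""
    eps = eps0
    for _ in range(steps):
        production = k_base + k_fb * eps / (K + eps)  # feedback term saturates at k_fb
        eps = eps + production - decay * eps          # net change per time step
    return eps
```

Starting from a small amount of EPS, the model climbs to a stable steady state where production balances loss (about 2.22 units with these constants), mirroring how the feedback loop drives cells toward the biofilm state.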
EpsA, the membrane receptor, directly binds EPS, thereby preventing autophosphorylation and subsequently activating the cytoplasmic kinase, EpsB. The active form of EpsB is then free to phosphorylate its downstream targets in the biosynthetic pathway, allowing further production
of EPS in a positive-feedback loop. The interaction between protein receptor and polysaccharide is specific: EpsA recognized EPS, but not a polysaccharide produced by Staphylococcus aureus. However, the molecular details of how the protein binds the polysaccharide and what features drive specificity are unknown. Another tyrosine kinase pair – PtkA and TkmA – also has a role in biofilm formation and EPS regulation. EpsA, EpsB, TkmA, and PtkA have been shown to complement one another and rescue biofilm formation when one or more of the four proteins is missing, suggesting possible crosstalk between the two pairs [3] (Fig. 2). It may also be possible for TkmA and PtkA to affect the canonical pathway that regulates SinR and the epsA-O operon, or to directly activate proteins downstream of EpsA and EpsB. The roles of PtkA and TkmA are further complicated between media types, as TkmA is thought to be necessary for biofilm formation on LBGM, but less important on MSgg [3].
Figure 2. Possible functions of TkmA-PtkA tyrosine kinase.
In this study, 38 genetically unique strains of B. subtilis were assayed in order to clarify the roles of TkmA and PtkA. Statistical analyses were used to improve the accuracy of biofilm assays, and a model is proposed describing the interactions of SinR, TkmA, and PtkA with EpsA and EpsB. Purifications of EpsB and PtkA were attempted to determine direct binding.
Materials and Methods
Strains, media, and growth conditions
The wild type B. subtilis strain 3610 and its derivatives were grown in lysogenic broth (5 g yeast extract, 10 g tryptone, and 10 g NaCl per 1 L, at pH 7) at 37◦ C in a shaking incubator or on LB agar plates containing 1.5% agar. Biofilm assays were incubated at 30◦ C on MSgg and LBGM
agar plates for either 48 or 72 hours. LBGM is made from LB by adding 1% (vol/vol) glycerol and 100 µM MnSO4. MSgg is a defined medium consisting of 0.1 M MOPS pH 7.0 (morpholinepropanesulfonic acid), 5 mM KPO4 pH 7.5 buffer, 2 mM MgCl2, 0.05 mM MnCl2, 0.001 mM ZnCl2, 0.002 mM thiamine, 50 µg/mL tryptophan, 50 µg/mL phenylalanine, 50 µg/mL threonine, 0.5% (vol/vol) glycerol, 0.5% (vol/vol) glutamate, 0.05 mM FeCl3, and 0.1 mM CaCl2. For solid assays on LBGM and MSgg media, 1.5% agar was used. Strains were preserved in LB broth with 20% (vol/vol) glycerol at -80◦ C.
Extraction of genomic DNA
Strains were grown in 3 mL of LB at 37◦ C for 4-16 hours or until OD600 = 1. Cells were spun down in a 1.5 mL Eppendorf tube at 14,000 rcf for 5 minutes. Cell pellets were resuspended in 400 µL MilliQ H2O; then 50 µL 0.5 M EDTA and 60 µL of 20 mg/mL lysozyme were added, mixed, and incubated at 57◦ C for 30 min. 650 µL nuclei lysis buffer and 250 µL protein precipitation solution (Wizard kit from Promega) were added, vortexed to mix, and centrifuged for 5 min at 14,000 rcf. The soluble fraction was mixed with isopropanol to precipitate DNA. The DNA pellet was washed with 70% ethanol and re-centrifuged at 14,000 rcf for 5 min. The pellet was then dried and resuspended in 100 µL molecular grade H2O, and the DNA concentration was quantified by Nanodrop spectroscopy (λmax = 260 nm) before the solution was stored at -20◦ C.
Transformation of B. subtilis
Host strains were made competent by incubating in 2 mL MC media (200 µL 10× MC concentrate, 60 µL 0.1 M MgSO4, 1.74 mL sterilized H2O) for 4-5 hours at 37◦ C. 300 µL of the competent cells were mixed with 8 µg digested DNA and incubated at 37◦ C for 2 hours. The mixture was plated on LB plates with 100 µg/mL spectinomycin and incubated at 37◦ C for 12-18 hours. Colonies were re-struck onto LB + antibiotic plates and incubated at 37◦ C for another 12-18 hours.
Transformation of E. coli
One aliquot of DH5α or BL21 was thawed on ice before 50 ng of plasmid DNA was added and incubated on ice for 30 min. The cells and DNA were then heat shocked at 42◦ C for exactly 45 sec, then rested on ice for 2 min. Cells were then incubated in 1 mL of LB for 1 hour at 37◦ C. Finally, they were plated on LB agar plates containing 100 µg/mL ampicillin and incubated for 12-18 hours at 37◦ C.
SPP1 phage transduction
Movement of DNA into strains using SPP1 phage was done according to the protocol of Yasbin and Young [4].
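The Nanodrop quantification in the genomic DNA extraction above conventionally relies on the Beer-Lambert approximation that one absorbance unit at 260 nm (1 cm path) corresponds to about 50 µg/mL of double-stranded DNA. A minimal sketch of that conversion, with the common A260/A280 purity check added for illustration (the example values are hypothetical):

```python
def dsdna_conc_ug_per_ml(a260, dilution_factor=1.0):
    """Approximate dsDNA concentration: 1 A260 unit ~ 50 ug/mL (1 cm path)."""
    return a260 * 50.0 * dilution_factor

def purity_ratio(a260, a280):
    """A260/A280 ratio; ~1.8 is conventionally taken as pure DNA,
    lower values suggest protein contamination."""
    return a260 / a280
```

For example, an undiluted sample reading A260 = 0.5 would be estimated at roughly 25 µg/mL of dsDNA.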
Primer construction
Primers were designed to separately clone EpsB and PtkA, each with an N-terminal His tag, for expression in E. coli. Gene products were amplified with Taq DNA polymerase according to a standard protocol with an annealing temperature of 52◦ C. PCR products were confirmed by agarose gel electrophoresis. PCR products were ligated into the pET15b vector, cut with NdeI and XhoI, using isothermal assembly (ITA). ITA products were transformed into DH5α (E. coli) and colonies were selected on LB agar plates with 100 µg/mL ampicillin. Plasmid DNA was isolated using the Qiagen MiniPrep Kit procedure and verified by sequencing.
Protein expression and purification
Plasmids containing EpsB and PtkA were transformed into BL21 E. coli cells and grown in 5 mL cultures of LB at 37◦ C for 12-18 hours. The cultures were combined and added to 500 mL flasks of LB with 50 µg/mL ampicillin. Cultures were grown at 37◦ C until OD600 = 0.5 (∼2 hours). IPTG was added to the cultures to a final concentration of 0.5 mM and cells were grown at 37◦ C for 4 more hours. The cells were harvested by centrifugation at 4,000 rcf for 20 min at 4◦ C. Cell pellets were resuspended in 4 mL/g BugBuster Extraction Reagent containing 45 µg/mL lysozyme and were rotated at room temperature for 20 min. The cell lysate was centrifuged at 20,500 rcf for 45 min at 4◦ C. The supernatant was combined with 1.5 mL Ni-NTA resin pre-equilibrated in wash buffer (50 mM sodium phosphate, pH 8, 150 mM NaCl, 15 mM imidazole, 1 mM PMSF) and mixed for 2 hours at 4◦ C. The cell pellet was resuspended in detergent wash buffer (wash buffer + 2% Tween-20 and 2% Triton X-100) and rotated for 4 hours at 4◦ C before it was centrifuged again at 20,500 rcf for 45 min. The supernatant + Ni-NTA resin was transferred into a chromatography column and washed with 200 mL wash buffer. The bound proteins were then eluted using 10 mL of elution buffer (37 mM sodium phosphate, pH 8, 110 mM NaCl, 250 mM imidazole, 1 mM PMSF).
Fractions were tested using a Bradford assay; the three samples with the highest protein concentrations were combined. Fractions from each step of the procedure (pre-IPTG, post-IPTG, lysate, supernatant 1, flow through, pellet 1, wash, elution, pellet 2, and supernatant 2) were saved for SDS-PAGE analysis.
SDS-PAGE
Fractions from the expression and purification of PtkA and EpsB were analyzed on 12% polyacrylamide gels. Samples were combined with a 1:1 ratio of loading dye (containing SDS and β-mercaptoethanol), mixed, and further denatured at 97◦ C for 10 min. Gel electrophoresis was run in a Tris-glycine SDS buffer at 150 V; gels were then stained using Bradford dye and imaged.
Western blot
The presence or absence of His-tagged protein after purification was confirmed by an α-His western blot. Following SDS-PAGE, proteins were transferred to membranes in Tris-glycine buffer with 20% methanol for 1 hour at 100 V. Membranes were blocked in 50 mL TBS-T buffer (1× TBS + 0.1% Tween 20) with 5% skim milk powder. The primary antibody used was α-His at 1:1000 in
5% milk. The secondary antibody used was α-rabbit at 1:2000. Antibodies were detected by chemiluminescence and membranes were imaged.
Biofilm assays
Strains were plated on LB from glycerol stocks at 37◦ C for 12-18 hours. Then, colonies were inoculated into 3 mL LB and grown, shaking, at 37◦ C for 3 hours. 3 µL of cells were spotted onto LBGM and MSgg agar plates and incubated at 30◦ C for 48-72 hours. The biofilms formed were then assigned a severity score from 0 to 100, based on the “wrinkliness” of their colonies. These biofilm assays were repeated 8 to 16 times on both MSgg and LBGM for each strain, creating a data pool on which to perform statistical analyses.
Statistical analysis
Working with a small data set required extra steps to normalize the data for accurate results. Extreme outliers were excluded using Dixon’s Q test [5]. Descriptive statistics were obtained for each strain, and comparisons between strains were made using bootstrap hypothesis testing in StatKey [6]. Because 188 hypothesis tests were performed, the Benjamini-Hochberg procedure was used as a correction for multiple comparisons, with an optimized false discovery rate of α = 0.05 [7]. This yielded a corrected significance level of α = 0.07234 (Fig. 3). This significance level is much more reasonable than other multiple-comparison corrections (a Bonferroni correction would yield α = 0.00053) and balances Type 1 and Type 2 errors.
Figure 3. Geometric representation of Benjamini-Hochberg Procedure
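The Benjamini-Hochberg step-up procedure shown in Fig. 3 can be sketched in a few lines: sort the m p-values, then take as the cutoff the largest p(i) satisfying p(i) ≤ (i/m)·α. A minimal sketch (the p-values in the test are illustrative, not the 188 from this study):

```python
def benjamini_hochberg_cutoff(pvals, fdr=0.05):
    """Return the BH significance cutoff: the largest sorted p-value p_(i)
    with p_(i) <= (i/m) * fdr, or 0.0 if none qualifies."""
    m = len(pvals)
    cutoff = 0.0
    for i, p in enumerate(sorted(pvals), start=1):
        if p <= (i / m) * fdr:
            cutoff = p  # keep updating; the last hit is the step-up cutoff
    return cutoff
```

All hypotheses with p-values at or below the returned cutoff are rejected; a Bonferroni correction would instead apply the fixed, far stricter threshold α/m.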
Results
Transformation using ∆SinR genomic DNA was found to be ineffective compared to transduction (Fig. 4). ∆TkmA ∆SinR mutants resulting from transformation looked very similar to the ∆SinR strain they were derived from. Statistical analysis supported this observation, as there was no significant difference between ∆TkmA ∆SinR (transformation) and ∆SinR (µ1L − µ2L = 6.571, pLBGM = 0.0940; µ1M − µ2M = 2.375, pMSgg = 0.3540). However, ∆TkmA ∆SinR mutants derived using transduction looked like a mixture of ∆TkmA’s and ∆SinR’s
Figure 4. Melding of phenotypes by transduction compared to transformation.
separate phenotypes. This was to be expected, and statistical analysis confirmed the significant difference between ∆TkmA ∆SinR (transduction) and ∆SinR (µ1L − µ2L = 35.589, pLBGM = 0.0000; µ1M − µ2M = 25.250, pMSgg = 0.0013). Transduction strains also had lower variance statistics than transformation products of the same genotype. In total, 83% of ∆SinR mutants derived from transformation had statistically significant differences from transduction products of the same type. Only 38% of transformation products had statistically significant differences from ∆SinR, while 86% of transduction products were significantly different from ∆SinR. Transduction was also evaluated and confirmed using PCR (Fig. 5). Transduction products were thus determined to be more accurate and were used instead of transformation products to represent each new genotype.
Figure 5. PCR confirming deletion of genomic DNA using transduction. Lane 1: Gene Ruler. Lanes 2 & 5: Wild Type. Lanes 3 & 4: Transduction Products. Lane 7: Known strain with correct genome (control).
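The bootstrap comparisons above were run in StatKey; the general technique can be approximated in code by resampling both groups from the pooled severity scores (the null of no difference) and counting how often the resampled mean difference is at least as extreme as the observed one. This is a sketch of the method, not StatKey's exact implementation, and the sample scores in the example are hypothetical:

```python
import random

def bootstrap_mean_diff_pvalue(a, b, n_boot=10000, seed=0):
    """Two-sided bootstrap p-value for a difference in means.
    Both groups are resampled from the pooled data, modeling the null
    hypothesis that they come from the same distribution."""
    rng = random.Random(seed)
    observed = sum(a) / len(a) - sum(b) / len(b)
    pooled = list(a) + list(b)
    extreme = 0
    for _ in range(n_boot):
        ra = [rng.choice(pooled) for _ in a]  # resample group of size len(a)
        rb = [rng.choice(pooled) for _ in b]  # resample group of size len(b)
        if abs(sum(ra) / len(ra) - sum(rb) / len(rb)) >= abs(observed):
            extreme += 1
    return extreme / n_boot
```

With hypothetical severity scores, two clearly separated groups yield a p-value near zero, while identical groups yield a p-value of 1, matching the interpretation of the comparisons reported in the text.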
∆SinR was shown to rescue the biofilm phenotypes of all single tyrosine kinase component deletions (Fig. 6). This was seen with ∆EpsB, as the removal of EpsB significantly decreased
its severity on LBGM (µWT − µ∆EpsB = 25.975, pLBGM = 0.0000). Supplementing this with the removal of SinR yielded a mean severity score consistent with the wild type (µWT − µ∆EpsB ∆SinR = 12.100, pLBGM = 0.0820). This effect was also seen on MSgg (Fig. 6). ∆SinR alone was also seen to greatly increase severity compared to wild type (µ∆SinR − µWT = 26.114, pLBGM = 0.0002; µ∆SinR − µWT = 26.283, pMSgg = 0.0001).
Figure 6. Rescue of biofilm severity by SinR.
∆SinR mutants were shown to rescue biofilm severity in many cases (Fig. 7):
µ∆TkmA ∆PtkA ∆SinR − µ∆TkmA ∆PtkA = 32.000, p = 0.0001 on LBGM
µ∆TkmA ∆PtkA ∆SinR − µ∆TkmA ∆PtkA = 20.625, p = 0.0050 on MSgg
µ∆EpsB ∆SinR − µ∆EpsB = 18.500, p = 0.0009 on LBGM
µ∆EpsB ∆SinR − µ∆EpsB = 7.625, p = 0.0100 on MSgg
∆EpsA ∆EpsB and ∆EpsA ∆EpsB ∆SinR did not show any statistically significant difference:
µ∆EpsA ∆EpsB ∆SinR − µ∆EpsA ∆EpsB = 12.722, p = 0.0750 on LBGM
µ∆EpsA ∆EpsB ∆SinR − µ∆EpsA ∆EpsB = 3.875, p = 0.0580 on MSgg (due to the very small difference in means, this result was judged a Type 1 error and the null hypothesis was retained).
Purification of EpsB and PtkA (each 25 kDa) was attempted, but both were unsuccessful. Electrophoresis confirmed successful PCR amplification, and insertion into pET15b was confirmed by sequencing. Multiple transformations into BL21 cells and expressions were performed, but only one attempt showed good expression on SDS-PAGE (Fig. 8). The α-His western blot showed nonspecific binding to proteins in all of the fractions, while lanes 8 and 9 from SDS-PAGE indicated that the proteins isolated on Ni-NTA resin were not our target proteins.
Figure 7. Severities of other genotypes on LBGM.
Figure 8. Expression of EpsB (top) and PtkA (bottom) on SDS-PAGE (left) and western blot (right). Lane 1: Pre-IPTG, Lane 2: Post-IPTG, Lane 3: Lysate, Lane 4: Supernatant, Lane 5: Flow through, Lane 6: Pellet, Lane 7: Wash, Lane 8: 4 µL Elution, Lane 9: 10 µL Elution, Lane 10: Pellet 2, Lane 11: Supernatant 2. Overall expression was seen on SDS-PAGE, but little specific binding was seen on the western blot.
Discussion
Consistent with previous reports [1, 3], we found that mutations of the tyrosine kinase components TkmA, EpsA, and EpsB all result in decreased levels of biofilm formation. We examined whether the SinR mutation, which is known to dramatically increase biofilm formation, relieved any of the biofilm defects we observed. We found that removal of one kinase component was consistently partially rescued by deleting SinR (Fig. 7). This is different from what is seen with a deletion of both EpsA and EpsB. We believe this difference might be due to cross-talk between the two tyrosine kinase complexes. For example, EpsB relies on its partner, EpsA, to become active. If TkmA is able to activate some EpsB when EpsA is defective, then some biofilm formation would be seen even in the absence of EpsA. However, this might only be observed if there is a high level of the EPS-producing protein that needs to be activated by the tyrosine kinase, as is seen in a ∆SinR background. This was seen in our assays and statistics, suggesting cross-talk between the two tyrosine kinases and illustrating the importance of looking at these cells when the canonical biofilm pathway is activated, or overcome by a SinR mutation.
TkmA and PtkA seem to have another important role in biofilm formation, as a mutant lacking both of them lost the ability to make a severe biofilm (x̄ = 10.50 ± 0.129). However, a ∆SinR mutation partially rescued the biofilm (x̄ = 42.50 ± 0.310, p = 0.0001 on LBGM), suggesting TkmA and PtkA could interact with the canonical pathway that regulates SinR. Alternatively, this rescue could be due to the increase in EpsA and EpsB proteins that should result from a ∆SinR mutation. This second option supports our idea that there is cross-talk between the two kinase pairs, or that one tyrosine kinase pair can partially compensate for the absence of the other. In order to show that there is crosstalk between members of the two tyrosine kinase pairs, we need to examine whether there are physical interactions between these proteins. Isolating EpsB and PtkA may require a different tagging sequence, as the Ni-NTA resin and antibodies may have bound other proteins containing high amounts of histidine. The tighter binding of these proteins to the resin and antibodies causes non-specific background and may be limiting our ability to obtain pure protein. Future attempts to purify EpsB and PtkA could include manipulating the expression conditions to allow for more complete protein folding, increasing the solubility of EpsB and PtkA, or using an entirely different purification tag. With pure proteins, we could determine whether they bind each other in a pull-down assay.
Acknowledgments This work was funded by the Manhattan College Jasper Summer Scholars Program. The author would like to thank his advisor, Dr. Sarah Wacker, for her guidance and support through this project, and Dr. Angel Pineda for direction.
References
[1] Elsholz A, Wacker S, Losick R. 2014. “Self-regulation of exopolysaccharide production in Bacillus subtilis by a tyrosine kinase.” Genes & Development 28:1710-1720.
[2] Grangeasse C, Cozzone AJ, Deutscher J, Mijakovic I. 2007. “Tyrosine phosphorylation: an emerging regulatory device of bacterial physiology.” Trends in Biochemical Sciences 32(2):86-94.
[3] Gao T, Greenwich J, Li Y, Wang Q, Chai Y. 2015. “The bacterial tyrosine kinase activator TkmA contributes to biofilm formation largely independently of the cognate kinase PtkA in Bacillus subtilis.” Journal of Bacteriology 197:3421-3432.
[4] Yasbin RE, Young FE. 1974. “Transduction in Bacillus subtilis by bacteriophage SPP1.” Journal of Virology 14:1343-1348.
[5] “Treatment of Analytical Data.” In Fritz JS, Schenk GH, Quantitative Analytical Chemistry, 5th ed., Allyn and Bacon, 1987, pp. 24-46.
[6] Lock et al. StatKey, to accompany Statistics: Unlocking the Power of Data by Lock, Lock, Lock, Lock, and Lock. http://www.lock5stat.com/StatKey/index.html
[7] Efron B. “False Discovery Rate Control.” statweb.stanford.edu/ckirby/brad/I.SI/chapter4.pdf
Investigating the role of the Med protein in biofilm formation
Juan Lara-Garcia∗
Department of Chemistry and Biochemistry, Manhattan College
Abstract. Bacteria and the biofilms that they form can play an important role in agriculture; certain diseases can also be caused by biofilms growing in the body. The purpose of this study is to investigate how biofilm formation is stimulated in Bacillus subtilis, and specifically to look at the role of the protein Med in this stimulation. To examine Med’s role in biofilm formation I took a two-pronged approach: (1) I created different strains of B. subtilis with mutations in Med and the related protein KinD and then collected quantitative and qualitative data on the effect these mutations have on how different small molecules stimulate B. subtilis biofilm formation; (2) I expressed and purified recombinant Med in order to examine the properties of this protein in vitro. My combined results suggest that Med does not play a role in small-molecule stimulation of biofilm formation; however, future work still needs to be done to look at in vitro interactions.
Introduction
Bacteria are among the most abundant forms of life on the planet, and many symbiotic relationships exist between bacteria and other organisms. There are many examples of such relationships: humans host bacteria in their intestines to aid in digestion; cyanobacteria can associate with certain types of fungi to form lichens; Rhizobium lives in the root nodules of plants, where it has a role in nitrogen fixation; and the bacterium Bacillus subtilis colonizes the roots of plants, where it induces resistance to a variety of pathogens. Understanding the specific nature of these relationships between bacteria and other organisms provides insight into the ways organisms can interact and benefit each other. Furthermore, in the case of some relationships, modulating the interaction can have important consequences for medicine and crop production. The focus of my research is B. subtilis, which, due to its ability to protect plants from pathogens, is used as a biocontrol agent in agriculture [1]. When B. subtilis colonize plant roots, they form a community, called a biofilm. Biofilms are aggregates of microorganisms, usually bacteria, that adhere to a surface and are held together by an extracellular matrix produced by the cells. Biofilms are important in a variety of fields, including medicine, environmental control, and agriculture. B. subtilis is a good model organism for the study of biofilm formation, as it has been used in other laboratory studies of cell division, genetic competence, gene control, multicellularity, spore formation, swarming, swimming, and other aspects of bacterial cell biology [1]. Furthermore, by working with B. subtilis, a bacterium that has been studied extensively, we have many molecular tools and can better control the organism under laboratory conditions. A number of proteins have been determined to play roles in the formation of biofilms in B. subtilis.
The main pathway results in the phosphorylation and activation of Spo0A, a member of the response regulator family of transcription factors (Fig. 1). Once phosphorylated, Spo0A can
Research mentored by Sarah Wacker, Ph.D.
act as either an activator or a repressor and is known to control about 120 genes directly and 500 genes indirectly [2]. The phosphorylation of Spo0A is governed by four different kinases, named KinA, KinB, KinC, and KinD; KinC and KinD have the largest roles in biofilm formation [2]. How the proteins KinC and KinD become active and first initiate biofilm formation is still not fully understood. There is some evidence that both KinC and KinD bind small molecules in order to initiate the process of biofilm formation [3]. Another protein, called Med, shows a substantial biofilm phenotype when deleted and is believed to play a role in biofilm formation by acting in the same pathway as KinD. Med was originally identified as a positive regulator of the competence gene comK [2] and has been determined to be a membrane-localized lipoprotein. Although B. subtilis has been extensively studied, the role of Med in biofilm formation remains obscure.
Figure 1. Simplified pathway of sporulation kinases and phosphorelay signal transduction pathway of B. subtilis [3]. It is believed that Med acts in the same pathway as KinD.
This study, therefore, focuses on the role of Med in the formation of biofilm and the way that this mysterious protein interacts with KinD and other small molecules. In order to investigate the role of Med in biofilm formation, I first created mutants of B. subtilis that are missing proteins such as the kinase KinD and the Med protein. I also examined how different small molecules, such as pyruvate, malate, and root extracts, affect biofilm formation and the activity of different luciferase reporters.
Materials and Methods
Preparation of calcium-competent cells
In order for cells to better take up plasmids for the cloning and expression of the Med protein, I created calcium-competent E. coli cells. To do this, E. coli (strain DH5α for plasmid preparations and strain BL21 for protein preparation) were plated on LB from frozen glycerol stocks and grown overnight at 37°C. The next day, a starter culture was prepared by selecting a single colony of E. coli from a fresh LB plate and inoculating a 10 mL culture of LB, which was grown, shaking, overnight at 37°C. On the third day, 1 L of LB media was inoculated with 10 mL of the starter
Lara-Garcia
The Manhattan Scientist, Series B, Volume 5 (2018)
39
culture and grown in a 37°C shaker until the OD600 of the culture reached 0.35-0.4, at which point the culture was immediately chilled on ice. Chilled cells were harvested by centrifugation at 4000 rpm for 15 minutes at 4°C. The cell pellet was resuspended in 400 mL of MgCl2 and cells were then harvested by centrifugation at 3000 rpm for 15 minutes at 4°C. The cell pellet was resuspended in 200 mL of ice-cold CaCl2 and allowed to rest on ice for 20 minutes. Cells were then harvested by centrifugation at 3000 rpm for 15 minutes at 4°C and resuspended in 50 mL of ice-cold 85 mM CaCl2, 15% glycerol. Cells underwent a final round of centrifugation at 2100 rpm for 15 minutes at 4°C. The supernatant was discarded, and the pellet was resuspended in 2 mL of ice-cold 85 mM CaCl2, 15% glycerol. Cells were aliquoted in 50 µL portions into sterile microfuge tubes that were chilled on ice and stored in a −80°C freezer.

Table 1. List of strains

Strain | Description or genotype | Reference or source
TT26 | ∆med::erm; sacA::PsdpA-lux, Cm | created by Tantan Guo
3610 | wild laboratory strain that is capable of forming biofilms |
JLG01a | ∆med::erm | created by transduction of TT26 phage into 3610
YC1078 | KinD::KinD overexpression (with IPTG-inducible hyperspank promoter), spec | created by Yunrong Chai
JLG03a | ∆med::erm; sacA::Peps-lux, Cm | created by transduction of TT26 phage into ALM89
JLG04a | ∆med::erm; KinD overexpression, spec | created by transforming genomic DNA from YC1078 into JLG01a
JLG05b | ∆med::erm; sacA::PsdpA-lux, Cm; KinD::KinD overexpression (with IPTG-inducible hyperspank promoter), spec | created by transforming genomic DNA from YC1078 into TT26
JLG06a | sacA::PsdpA-lux, Cm; KinD::KinD overexpression (with IPTG-inducible hyperspank promoter), spec | created by transforming genomic DNA from YC1078 into CY136
eJLG01 | pJLG01 for expression: pET15b with full-length Med protein with N-terminal His tag, Amp |
eJLG02 | pJLG02 for expression: pET15b with Med protein that is shortened at its N-terminus, with N-terminal His tag, Amp |
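The harvest steps in the protocol above report spin speeds in rpm, which only translate to a g-force for a particular rotor. A small sketch using the standard RCF formula can sanity-check such speeds; the ~8 cm rotor radius below is an assumed example value, not stated in the protocol:

```python
# Convert rotor speed (rpm) to relative centrifugal force (RCF, x g).
# Standard formula: RCF = 1.118e-5 * r * rpm^2, with r = rotor radius in cm.
def rcf(rpm, radius_cm):
    """Relative centrifugal force for a given speed and rotor radius."""
    return 1.118e-5 * radius_cm * rpm ** 2

# e.g., the 4000 rpm harvest spin on a hypothetical 8 cm rotor:
print(round(rcf(4000, 8)))  # ~1431 x g
```

Because the same rpm gives different forces on different rotors, protocols transferred between centrifuges are usually matched by RCF rather than rpm.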
Primer design and PCR
Plasmids containing Med were designed for protein expression in E. coli. To do this, I designed primers that would PCR-amplify the med gene and allow for insertion into a plasmid through isothermal assembly. The following primers were created to incorporate full-length Med from B. subtilis and N-terminally shortened Med into pET15b, each with an N-terminal His tag:
Primer 1: primer sequence from the back end of the med gene to incorporate into the plasmid, used for both the full-length and shortened versions. GGTGCCGCGCGGCAGCCATAAGTTGATCACAAGGCTTGTCATGATC
Primer 2: primer sequence from the front of the full-length med gene to incorporate into the plasmid. AGCAGCCGGATCCTCGAGCATTACTCGTTTTTTGGCAGCTCG
Primer 3: primer sequence for the front of the shortened med gene (missing the first 28 amino acids) to incorporate into the plasmid. GGTGCCGCGCGGCAGCCATAAGGTCGGCATGCTCTTTC

Transformation of E. coli
After creating plasmids for recombinant Med, I expressed these genes by transforming them into E. coli. To do this, one 50 µL aliquot of E. coli was combined with 1 µL of plasmid DNA and incubated on ice for 30 minutes. The cells were then placed in a 42°C water bath for exactly 45 seconds, followed by ice for 2 minutes to allow the E. coli cells to recover. The cells were then grown, shaking, in 1 mL of LB at 37°C. Cells were plated on LB agar plates supplemented with 100 µg/mL ampicillin. The plates were incubated at 37°C for 12-16 hours, after which individual colonies were selected.

Phage transduction of B. subtilis
To create B. subtilis strains with the desired mutations, I used SPP1 phage transduction according to standard protocols. A colony from a 1-day-old plate was inoculated into 2 mL of TY media (100 mL LB + 10 mM MgSO4 + 0.1 mM MnSO4) and grown at 37°C in a roller drum until the culture was dense (∼6 hours). SPP1 phage were then diluted in TY liquid media by combining 100 µL of SPP1 phage with 200 µL of dense culture in small culture tubes. These tubes were vortexed briefly and incubated statically at 37°C for 15 minutes. To each culture tube an additional 3 mL of TY soft agar (TY media + 0.5% agar) was added, and this soft agar culture mixture was poured onto fresh TY plates (TY media + 1.5% agar). The plates were allowed to dry and then incubated at 37°C overnight. The following morning, 5 mL of TY media was added to each plate and the lysate was scraped up into a tube and vortexed vigorously.
The cells were then spun down at 5000 rpm for 10 minutes, and the lysate was transferred to a new tube, where it was combined with 10 µL of 10 mg/mL DNase and incubated statically at room temperature for 30 minutes. After incubation, the lysate was filter-sterilized with a 0.2 µm filter and stored at 4°C. To transduce new strains with the phage lysate, single colonies of the recipient bacteria were inoculated from 1-day plates into 3 mL TY media, where they were grown at 37°C for 6-8 hours. In a 15 mL tube, 900 µL of the recipient strain was combined with 9 mL of TY media and 100 µL of lysate. The mixture was incubated statically at 37°C for 15 minutes, followed by centrifugation at 5000 rpm for 10 minutes. Finally, the supernatant was poured off, and the remaining cell pellet was resuspended in the remaining TY media and plated onto the appropriate antibiotic plate supplemented with 5 mM citrate.

Luciferase assay
B. subtilis were cultured from 1-day-old colonies and grown in LB at 37°C for ∼3 hrs (to mid-log phase) and then diluted 1:100 with fresh LB media. 200 µL of diluted cells were added
to each well of a 96-well plate (white with a clear bottom). Media was supplemented with the appropriate concentrations of malate or pyruvate from 1 M stock solutions or with 2% root extract samples (provided by Alexis Brown). Cells were grown for 14 hours at 37°C with slow shaking in a FilterMax F5 multi-mode plate reader (Molecular Devices). Luminescence and absorbance (600 nm) readings were taken every 10 minutes. To analyze the data, luminescence readings were normalized against cell absorbance for each time point. Data shown are representative results from at least two different experiments.

Pellicle biofilm assay
B. subtilis were cultured from 1-day-old colonies and grown in LB at 37°C for ∼3 hrs (to mid-log phase) and then diluted 1:1000 in 8 mL LB media. Media was supplemented with 1% root extract (80 µL to 8 mL) or the appropriate concentration of pyruvate or malate from a 1 M stock solution. Biofilm cultures were incubated at 30°C for 48-72 hours before images of the cultures were taken. Biofilms with supplements were compared to biofilms in LB alone and in a biofilm-supporting media, LBGM (Fig. 2).
Figure 2. Biofilms of wild-type B. subtilis in LB and LBGM media. Top three wells show biofilm created by wild type in 8 mL of LB media, and bottom three wells show biofilm created by wild type in 8 mL of LBGM.
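The per-timepoint normalization described in the luciferase assay can be sketched in a few lines; the readings below are illustrative values, not measured data from this study:

```python
# Hedged sketch of the analysis step: relative luciferase activity is
# luminescence divided by cell absorbance (OD600) at each time point.
def normalize(luminescence, od600):
    return [lum / od for lum, od in zip(luminescence, od600)]

# Illustrative (not measured) readings at three consecutive 10-min reads:
lum = [1200.0, 2600.0, 5400.0]
od = [0.12, 0.20, 0.36]
activity = normalize(lum, od)  # ≈ [10000, 13000, 15000]
```

Dividing by OD600 corrects for the fact that denser cultures emit more total light, so the normalized value tracks per-cell reporter expression rather than growth.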
Purification of Med protein
For future in vitro assays, full-length and N-terminally shortened His-tagged Med proteins were expressed and purified. Transformed BL21 cells were grown overnight and then diluted 1:70 into 500 mL LB with 100 µg/mL ampicillin. These cultures were shaken at 37°C until they reached an OD600 of 0.5. Protein expression was induced with 500 µM IPTG and cells were grown at 37°C for 4 hours. Cells were harvested by centrifugation at 4000 rpm for 20 minutes and cell pellets were stored at −20°C overnight. Bacterial cells expressing Med constructs (4 g) were resuspended in 40 mL of BugBuster Extraction Reagent supplemented with 1 mM PMSF and 25 µg lysozyme and incubated at room temperature for 30 minutes to allow the cells to lyse. The lysate was centrifuged at 20,500 rpm for 45 minutes at 4°C. The membrane pellet was resuspended with 10 mL of wash buffer (50 mM
sodium phosphate, pH 8, 150 mM NaCl, 15 mM imidazole, and 1 mM PMSF) supplemented with 2% Tween-20 and 2% Triton X-100 over 4 hours before it was re-centrifuged at 20,500 rpm for 45 minutes. The supernatant from the initial spin was combined with 1.5 mL of pre-equilibrated Ni-NTA resin and allowed to mix for 4 hours at 4°C. The supernatant and Ni-NTA resin were poured through a mini column and the flowthrough was collected. Afterwards, the column was washed with 200 mL of wash buffer; when the buffer had run out of the column, the protein was eluted with 8 mL of elution buffer (37.5 mM sodium phosphate, pH 8, 11 mM NaCl, 250 mM imidazole, and 1 mM PMSF). Fractions of the elution were collected and checked for the presence of protein using a Bradford assay. Fractions that contained protein were combined. Samples were collected throughout the purification and analyzed on a 12% SDS-PAGE gel. Full-length His-tagged Med is expected to be 38 kDa and the His-tagged N-terminally shortened Med is expected to be 34 kDa.
Results
In order to determine whether Med is responsible for recognizing small molecules that are known to stimulate biofilm formation, I tested the effect of pyruvate, malate, and root extract on B. subtilis with a deletion of the med gene in two types of assays: a luciferase reporter assay and pellicle biofilm assays. The luciferase assay relies on a transcriptional response from genes that are activated by Spo0A, and thus it is a high-throughput, but indirect, assay to determine whether a small molecule stimulates biofilm formation. For this assay, I used two reporters: one based on the promoter for sdpA, which is highly sensitive to Spo0A activity but not directly a part of the biofilm pathway, and one based on the promoter for tapA, which is also under Spo0A control and drives expression of components of the biofilm matrix. My data indicate that pyruvate, a small molecule known to bind KinD [3], is able to increase expression of the sdpA reporter in wildtype 3610 cells in a dose-dependent manner (Fig. 3A, CY136 curve). I then tested whether a strain that is missing KinD is also able to be stimulated by pyruvate. My data show that ∆kinD bacteria have a much lower basal expression of the sdpA reporter, and that this is not further stimulated by pyruvate (Fig. 3A, CY137 curve). In contrast, ∆med bacteria show moderate baseline expression of sdpA, which is further increased by the addition of pyruvate in a dose-dependent manner (Fig. 3A, TT26 curve). To determine if the same effect of pyruvate was seen on a more relevant biofilm reporter, I conducted luciferase assays with bacterial strains harboring the tapA-luciferase reporter. My results are consistent, in that wild type (ALM91) and ∆med (JLG02a) bacteria do increase the expression of the reporter with pyruvate, but ∆kinD (RL5313) bacteria are unresponsive to pyruvate (Fig. 3B). The effect of pyruvate on ∆med and ∆kinD bacteria was further examined through a pellicle biofilm assay.
In this assay, I tested the biofilm phenotype of all six strains from the luciferase assays in LB media supplemented with 5 mM pyruvate. Wild type strains are able to form a weak biofilm on the sides of the dish. In ∆med mutants a very weak biofilm can be seen in the presence of 5 mM pyruvate, while ∆kinD mutants are unable to form a biofilm (Fig. 3C). As expected, I see similar results in both versions
of the wild type and mutant strains that carry either an sdpA reporter or a tapA reporter.
Figure 3. The effect of pyruvate on biofilm formation. (A) Pyruvate has a dose-dependent effect on the expression of an sdpA-luciferase reporter in wildtype and ∆med strains, but not in ∆kinD. CY136 is wildtype bacteria, TT26 is ∆med, and CY137 is ∆kinD. (B) Pyruvate has a dose-dependent effect on the expression of a tapA-luciferase reporter in wildtype and ∆med strains, but not in ∆kinD. ALM91 is wildtype bacteria, JLG02a is ∆med, and RL5313 is ∆kinD. (C) (bottom) 5 mM pyruvate is added to cultures of LB and bacteria that are wildtype, ∆med, and ∆kinD.
Malate is a molecule similar in structure to pyruvate that has also been shown to stimulate B. subtilis biofilm formation [1]. In Fig. 4B we see that wild type B. subtilis is capable of forming a moderate biofilm in LB + 5 mM malate, which is very similar to the wild type in LB media alone. When looking at the ∆med and ∆kinD mutants, we see that neither strain, regardless of the reporter, was able to form a biofilm. These data are consistent with my luciferase reporter data, which show that malate does not strongly stimulate biofilm formation in my hands: I did not see much stimulation of the wildtype bacteria (CY136), the ∆med mutant (TT26), or the ∆kinD mutant (CY137) in the presence of malate.
Figure 4. Effect of malate on biofilm formation. (A) (left) Malate does not stimulate expression of an sdpA-luciferase reporter in wildtype or mutant cells; (B) (right) 5 mM malate is added to cultures of LB and bacteria that are wildtype, ∆med, and ∆kinD.
Fig. 5 shows that the sdpA-luciferase reporter of wildtype B. subtilis (CY136) is slightly stimulated in the presence of root extract wash. Root extract, however, has little effect on the luciferase reporter of the ∆kinD mutant (CY137). The ∆med strain (TT26) responded similarly to the wildtype bacteria, suggesting that KinD, but not Med, might have a role in sensing tomato root extract. Biofilm data with root extract were inconclusive (data not shown).
Figure 5. The effect of root extract on the sdpA-luciferase reporter. Root extract stimulates the expression of an sdpA-luciferase reporter in wildtype and ∆med strains, but not in ∆kinD.
Another aspect of my research was to purify the Med protein to aid future in vitro experiments on the role of Med in biofilm formation. I cloned and attempted the purification of two constructs of Med: one that is full-length and one that is missing the first 28 amino acids of its protein sequence, which include the targeting sequence and the putative site for lipidation. The second construct was designed based on homology to a Med homolog from Bacillus halodurans that has been purified and crystallized. Both proteins have an N-terminal His tag, and I attempted
to purify them with Ni-NTA affinity resin. Fig. 6 shows the gel for our expression of eJLG01, our full-length Med protein, which did not produce bands at the expected size. In contrast, for our expression of shortened Med, eJLG02, we see a band at the expected molecular weight in our elution samples.
Figure 6. 12% SDS-PAGE gels for eJLG01 and eJLG02. (A) Gel for eJLG01 (full-length Med protein), which has a predicted molecular weight of 38 kDa. (B) Gel for eJLG02 (shortened Med protein), which has a predicted molecular weight of 34 kDa. Wells are in the following order: 1, ladder; 2, pre-IPTG; 3, post-IPTG; 4, supernatant; 5, flowthrough; 6, pellet; 7, wash; 8, elution 1; 9, supernatant 2; 10, elution 2. ∗, indicates shortened Med protein.
Discussion
As shown in the previous section, most of my work depends on two types of data: luciferase reporter activity and the biofilm phenotypes determined by pellicle biofilm assays. My research can be split into two parts: one in which I look at the effect that different small molecules have on the formation of biofilm, and a second in which I examine the effect that different mutations of B. subtilis have on the biofilm phenotype. In Fig. 4 we see that wild type B. subtilis is capable of forming a moderately strong biofilm in LB with 5 mM malate; this is very similar to the biofilm of wild type bacteria in LB media alone (Fig. 2). We also see this result in both versions of the wild type that carry either an sdpA reporter or a tapA reporter. When looking at the ∆med and ∆kinD mutants, we see that neither strain, with either reporter, was able to form a biofilm. These data are consistent with the luciferase reporter data, which show no significant difference in luciferase activity between ∆med mutants with and without malate; the same can be said for ∆kinD mutants with and without malate. These results suggest that neither Med nor KinD plays a role in detecting malate. However, more replicates are needed in order to comment more confidently on the effect that malate has on the stimulation of biofilm. Previous research has indicated that malate should stimulate the formation of biofilm through KinD, a result that I have been unable to reproduce, possibly due to degradation of malate, pH, or other factors that were not fully controlled.
In Fig. 3C we see that wild type strains are able to form a weak biofilm on the sides of the dish in the presence of pyruvate. In ∆med mutants a very weak biofilm can be seen in the presence of 5 mM pyruvate, while the ∆kinD strain was unable to form a biofilm. This supports the idea that Med does not play a role in sensing pyruvate while KinD does. Luciferase reporter data also support this idea, as wild type and ∆med strains show a dose-dependent response to pyruvate, whereas ∆kinD shows no response even when the dose of pyruvate is increased. Another important part of our study was to purify the Med protein in order to investigate how Med and KinD interact in vitro. The His-tagged full-length Med protein is predicted to be 38 kDa, which allows us to run an SDS-PAGE gel of the different samples that were collected and compare it to a ladder of known sizes to determine which of the bands is most likely to contain our protein. Fig. 6A shows the gel for our expression of eJLG01, our full-length Med protein, which did not produce bands at the expected size. In contrast, for our expression of shortened Med, eJLG02, we see a band at the expected molecular weight in our elution samples. Given that this expression was successful, the protein from this sample was further purified and concentrated for future use in examining in vitro interactions between Med and KinD. Future work will continue to focus on the purification of the full-length Med protein; because eJLG01 did not purify, there is a possibility that the protein was trafficked out of the cell and therefore was not seen in the gel. I will also try to further concentrate the purified eJLG02 version of Med in order to conduct in vitro testing of this protein, looking at interactions with KinD and other small molecules, which in turn may provide insight into the role of full-length Med.
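The expected size difference between the two constructs can be roughly checked from the truncation itself. The ~110 Da average residue mass used below is a textbook approximation, not a value from this paper:

```python
# Removing 28 residues (~0.110 kDa each on average) from the 38 kDa
# full-length His-tagged Med predicts the size of the shortened construct.
FULL_LENGTH_KDA = 38.0
AVG_RESIDUE_KDA = 0.110   # assumed average amino acid residue mass
shortened_kda = FULL_LENGTH_KDA - 28 * AVG_RESIDUE_KDA
print(round(shortened_kda, 1))  # ~34.9 kDa, near the expected 34 kDa
```

The estimate lands within about 1 kDa of the stated 34 kDa prediction, which is within the resolution one can read from a 12% SDS-PAGE gel.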
Acknowledgments
This work was funded by the Manhattan College Jasper Research Scholars program. The author would like to thank his mentor, Dr. Sarah Wacker, for her continued guidance and support.
References
[1] Chen Y, Cao S, Chai Y, Clardy J, Kolter R, Guo JH, Losick R. (2012). A Bacillus subtilis sensor kinase involved in triggering biofilm formation on the roots of tomato plants. Mol Microbiol 85, 418-430.
[2] Banse AV, Hobbs EC, Losick R. (2011). Phosphorylation of Spo0A by the histidine kinase KinD requires the lipoprotein Med in Bacillus subtilis. J Bacteriol 193, 3949-3955.
[3] Wu R, Gu M, Wilton R, Babnigg G, Kim Y, Pokkuluri PR, Szurmant H, Joachimiak A, Schiffer M. (2013). Insight into the sporulation phosphorelay: crystal structure of the sensor domain of Bacillus subtilis histidine kinase, KinD. Protein Sci 22, 564-576.
Purification and isolation of glucose oxidase for use in a bio-battery
Monique Ng∗
Department of Biology, Manhattan College
Abstract. The main objective of this project was to create an environmentally friendly biofuel cell by genetically engineering Aspergillus niger glucose oxidase (GOx) variants for increased stability, catalytic activity, and affinity for a gold nanowire electrode. GOx is a stable and well-studied enzyme that catalytically oxidizes glucose. The enzymatic reaction is an attractive source of electron production for the anodic half-cell of a biological battery. We genetically designed the coding sequence of GOx for optimal expression in E. coli and aimed to express and purify each of our GOx variants. Our variants include a previously reported GOx with increased kinetic activity and stability, onto which we engineered a cysteine tag. Thiol groups have a high affinity for gold, and we expect that we can sequester the electron-producing GOx directly to a gold anode, optimizing electron transfer and increasing the energy potential. We report here that we successfully expressed all of our mutants and developed a protocol for purification of these enzymes from large-scale recombinant expression in E. coli.
Introduction
While conventional batteries are sufficiently powerful, they hold many risks to the environment and to the health of the individual. Exploitation of fossil fuels and other non-renewable energy sources has brought attention to their negative impact on the health of the environment. The traditional battery is not environmentally green, due to toxic chemicals such as acids, lithium, cadmium, and alkaline electrolytes, which can leak into the environment and endanger the surrounding wildlife (Energizer, 2017). In addition, the climate has gotten warmer due to fossil fuel consumption, which increases atmospheric carbon dioxide levels (Zecca and Chiari, 2009). This has driven interest in alternative, greener options such as biological batteries, which convert chemical energy to electrical energy through electrogenerated redox reactions. Furthermore, misuse of traditional batteries can lead to chemical burns, irritation of the respiratory system, and explosions, to name just a few of their other potential harms (Azocleantech, 2008). Glucose-fueled bio-batteries (biofuel cells), on the other hand, provide a greener alternative energy source that allows for a continuous supply of energy, and they are low maintenance when supplied with a steady source of glucose. Redox enzymes power biofuel cells through enzymatic reactions that either utilize or release electrons during their catalytic turnover. These oxidases and reductases are renewable and less expensive compared to the precious-metal catalysts used in conventional fuel cells. In addition, these enzymes are optimized in neutral-pH buffers, making them attractive candidates to power implantable medical devices (Zhu et al., 2014). Sugars, which are readily available in our bodies, are a suitable fuel for bio-batteries because a single glucose molecule can yield up to 24 electrons on complete oxidation. For free-standing bio-batteries, sugar is cost-efficient, widely accessible, and safe.
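The 24-electron figure above sets a theoretical ceiling on the charge a glucose fuel can deliver. A back-of-the-envelope calculation with the Faraday constant (the values below are standard physical constants, not measurements from this work) puts that ceiling in battery-relevant units:

```python
# Theoretical maximum charge from complete oxidation of glucose:
# 24 electrons per molecule, converted via the Faraday constant.
FARADAY = 96485.0            # coulombs per mole of electrons
ELECTRONS_PER_GLUCOSE = 24   # complete oxidation to CO2
GLUCOSE_MW = 180.16          # g/mol

charge_per_mol = ELECTRONS_PER_GLUCOSE * FARADAY         # C per mol glucose
amp_hours_per_gram = charge_per_mol / GLUCOSE_MW / 3600.0
print(round(amp_hours_per_gram, 2))  # ~3.57 Ah per gram of glucose
```

Real biofuel cells capture only a fraction of this (GOx itself transfers two electrons per turnover), but the ceiling illustrates why sugar is an attractive, energy-dense fuel.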
This high-energy battery may have future commercial medical applications such
Research mentored by Bryan Wilkins, Ph.D.
as a blood glucose monitor for diabetics or a pacemaker in patients with cardiac complications. The work reported here focuses on combining these biological redox reactions with nanoscale electrodes, which will be of critical importance for the development of efficient batteries that can be miniaturized for implantation. Medical applications for bio-batteries require that they simultaneously become smaller and more efficient. Our biofuel cell aims to maximize electron shuttling by utilizing variants of Aspergillus niger glucose oxidase (GOx) that were genetically engineered for increased stability, activity, and affinity to gold nanowires. GOx possesses high specificity toward glucose, oxidizing it and producing electrons. GOx is a flavoprotein oxidoreductase that oxidizes glucose by using oxygen as the electron acceptor to produce glucono-δ-lactone (gluconolactone) and hydrogen peroxide (Wohlfahrt et al., 1999). It is one of the most well-studied enzymes for use in biofuel cells because its reaction generates electrons at the anode. In the literature, A. niger GOx mutant variants were reported to be more stable and catalytically active than the wild type form (Holland et al., 2012). Interestingly, the cited enzyme was not engineered to be used in a biofuel cell, and we believe it possesses great potential in that role. We took advantage of these previously reported variants and employed them as a way to increase catalytic activity and electron production in a biofuel cell. We hypothesize that the more catalytically active enzyme, having a higher enzymatic turnover rate, will create an electron sink that further optimizes the energy potential at the anode. This protein is very bulky, however, making direct electron transfer difficult due to steric hindrance (Cosnier et al., 2016). As a result, the protein often needs a mediator, such as dimethylformamide (DMF), to help shuttle the electrons through the enzymatic channel.
In collaboration with the Santulli research group (Manhattan College), we are working toward eliminating the need for a mediator by using gold nanowire at the anode and directly attaching the enzyme to the metal. We believe that this will aid in achieving direct electron transfer by orienting the redox center at the shortest possible distance from the electron acceptor. The sequestering mechanism relies on biological thiol groups (cysteine residues) added to the terminal ends of GOx. It is well established that thiol groups have a high affinity for gold, and we propose that adding these chemical handles will create a mechanism by which GOx can adhere to the anode wires. The binding, in combination with the nano-sized electrode, is expected to increase the battery potential while minimizing the total battery size. We suggest that if GOx has an increased affinity for the anode, while simultaneously being more catalytically active, we can increase the efficiency of a standard glucose biofuel cell by a significant margin. The distance that the electron must travel from the enzyme (typically through mediator molecules) will be reduced essentially to zero, creating a system that allows for the direct transfer of the electron to the anode. This, in turn, should allow the electron to transfer at a faster rate.
Project Design
Glucose oxidase (GOx) variant design
Mutant variants of A. niger GOx were genetically engineered to design a biofuel cell with increased activity and affinity to gold nanowire. Four GOx derivatives were engineered: GOx-Wt (Wt, wild type), GOx-cys (cys, cysteine), GOx-4mut (mut, mutant: T56V, T132S, H469C, C543V), and GOx-4mut-cys (Fig. 1). The sequences for the glucose oxidase (GOx) variants were engineered from A. niger GOx. This sequence was derived from NCBI GenBank accession AAA32695 (E.C. 1.1.3.4), and provided the genetic framework for the wild type (Wt) and mutant sequences. The coding sequences were optimized for E. coli and synthesized by IDT (https://www.idtdna.com/pages). Variants were engineered to contain a coding region with an N-terminal 6x histidine tag and a proximal TEV protease site. GOx-cys was designed with a C-terminal 3x cysteine tag. The quadruple GOx mutant (4mut) was engineered for increased stability, efficiency, and direct electron shuttling to the anode. The gene cassettes were cloned into a pDCF-duet expression vector via EcoRI and PstI sites, using standard cloning techniques. This placed the expression of GOx under control of the T7lac promoter.
Figure 1. GOx gBlock (IDT) units of wild type and mutant variants.
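The variant layouts in Fig. 1 can be summarized schematically. The sketch below is purely illustrative (placeholder strings stand in for the actual coding sequences), showing how the His6 tag, TEV site, and optional cysteine tag are arranged:

```python
# Illustrative layout of the GOx variants: N-terminal 6x His tag,
# TEV protease site (ENLYFQG), GOx core, optional C-terminal 3x Cys tag.
HIS6 = "HHHHHH"
TEV_SITE = "ENLYFQG"

def cassette(gox_core, cys_tag=False):
    """Assemble a variant layout; `gox_core` is a stand-in for the GOx sequence."""
    return "M" + HIS6 + TEV_SITE + gox_core + ("CCC" if cys_tag else "")

gox_wt = cassette("GOX")          # GOx-Wt layout
gox_cys = cassette("GOX", True)   # GOx-cys layout, ends in the 3x Cys gold handle
```

The N-terminal His tag enables affinity purification, the TEV site allows the tag to be cleaved off afterward, and the C-terminal cysteines provide the thiol handle for gold attachment.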
GOx expression and lysis
BL21(DE3) cells transformed with the expression vector were IPTG-inducible. Two expression temperatures were explored, 30°C and 37°C; cells were grown in LB medium (supplemented with spectinomycin) and incubated at the respective temperatures until induction at a cell density of OD600 ∼ 0.8. Expressions were performed in 2 L volumes, induced with IPTG (0.5 mM), and allowed to produce protein for the times indicated. Small-scale (10 mL) test expressions were assayed at times 0, 2 h, 4 h, 6 h, and overnight (12-16 h). Large-scale expressions were stopped at 4-6 h, pelleted, and washed with 10 mM Tris-HCl–50 mM NaCl, pH 8.0 (Tris buffer). The cleaned pellet was resuspended in Tris buffer (supplemented with 1 mM PMSF and 0.2 mg/mL lysozyme), allowed to incubate at room temperature for 30 min with shaking, and then lysed by sonication. GOx migrates to the inclusion bodies, as
evidenced by standard SDS-PAGE electrophoresis and western blotting against the expressed histidine tag. Lysates were clarified by centrifugation (18,000 rpm, Sorvall) and the insoluble fraction was washed with buffer containing 20 mM Tris-HCl, 100 mM NaCl, 1 mM EDTA at pH 8.0 (Tris wash). Subsequently, a Tris wash with 1% Triton X-100 was performed, followed by another without Triton.

GOx solubilization and refolding

The insoluble fraction was further washed with 2 M urea for 30 minutes at room temperature. The GOx insoluble fraction was again collected by centrifugation. GOx was extracted from the pellet using 8 M urea with 30 mM dithiothreitol (DTT) and incubated at room temperature for 1 h with shaking. The enzyme solution was concentrated (∼5 mg/mL), then diluted 100-fold in refolding buffer (1 mM reduced glutathione [GSH], 1 mM oxidized glutathione [GSSG], 0.05 mM flavin adenine dinucleotide [FAD], 10% [vol/vol] glycerol, 20 mM Tris-HCl, pH 8.0) and allowed to fold for 7 days at 10°C. The refolded protein was concentrated and buffer exchanged into reaction buffer (50 mM sodium phosphate, pH 7.4).

GOx functional verification

Refolded GOx activity was tested using the Invitrogen Molecular Probes Amplex Red Glucose/Glucose Oxidase Assay Kit (A22189), following the manufacturer's protocol.

SDS-PAGE electrophoresis and western blot

All SDS-PAGE electrophoresis was performed in standard running buffer (25 mM Tris, 192 mM glycine, 0.1% SDS) on 12% acrylamide gels; samples were denatured by heat in SDS loading buffer. Staining was achieved using Coomassie. Western blotting was performed by transferring proteins to a PVDF membrane in standard transfer buffer (25 mM Tris, 192 mM glycine, 20% methanol) for 1 h at 100 V constant. Membranes were checked for protein transfer efficiency using Ponceau-S. Following removal of the Ponceau-S, the membranes were blocked in 5% milk-TBS (w/v; 50 mM Tris, pH 7.5, 150 mM NaCl) for 1 h with shaking.
The blocking solution was removed and the membrane was incubated overnight, with shaking at 4°C, in a 1:5000 dilution of anti-His antibody (in 5% milk-TBS). The membrane was then washed with TBS and incubated in a 1:10000 dilution of anti-mouse HRP-conjugated secondary antibody (in 5% milk-TBS) for 45 min at 4°C. The membrane was washed extensively, activated using Amersham ECL Select substrate, and then imaged.
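As a practical aside, the antibody dilutions above (1:5000 primary, 1:10000 secondary) translate directly into pipetting volumes. The helper below is a generic illustration, not part of the original protocol; the 10 mL final volume is an assumed example.

```python
def dilution_volumes(final_volume_ml, ratio):
    """Return (stock_ml, diluent_ml) for a 1:ratio dilution.

    The two volumes add up to the requested final volume.
    """
    stock_ml = final_volume_ml / ratio
    return stock_ml, final_volume_ml - stock_ml

# Example: 10 mL of a 1:5000 primary-antibody dilution in 5% milk-TBS
stock, diluent = dilution_volumes(10.0, 5000)
print(f"{stock * 1000:.1f} uL antibody + {diluent:.3f} mL diluent")
```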
Results

A timed expression of GOx-Wt and GOx-cys was performed to determine optimal GOx production levels. The protein of interest is roughly 65 kDa. We induced cells for expression and
assayed whole cell lysates at 2-hour time points through 6 hours, and then allowed a longer overnight expression. Following SDS-PAGE and Coomassie staining, a distinct band concentrates around the 65 kDa marker and intensifies as the expression continues over time (Fig. 2). Overnight expressions yielded a lower production level of GOx (data not shown). It is presumed that the cells produce peak amounts of GOx at 4-6 h.
Figure 2. Hourly test expression of GOx for wild type and cysteine tagged proteins.
While we observed clear expression of our protein, our time assays were from whole cell lysates. We expected the protein to be isolated at high concentrations from large-scale expressions. Unfortunately, this was not the case. We attempted, under several conditions (alternative buffers, temperatures, lysis techniques), to isolate GOx from the soluble fraction, but GOx continually migrated to the insoluble fraction under all conditions. Very small amounts of GOx were observed in the soluble fraction, and we attempted to enrich this material by nickel affinity purification, but were unsuccessful. Literature searches revealed that other groups had similar issues when expressing GOx recombinantly in E. coli. Witt and colleagues extracted GOx from the inclusion bodies and refolded the protein to recover activity (Witt et al., 1998). We decided to follow this technique and therefore extracted and refolded GOx. We washed the insoluble fraction with 2 M urea to remove contaminants, and then extracted GOx from the fraction using 8 M urea-DTT. We assessed the extraction from the inclusion body (load) through each urea wash and extraction step (Fig. 3). The 2 M urea acted as a wash to remove as much contamination as possible prior to the extraction. While the gel represented in Fig. 3 is overloaded, it is still clear that a large concentration of GOx remains in the inclusion body following the 2 M urea wash. We then extracted protein in 8 M urea and determined that our initial extraction conditions were insufficient: very little GOx was found in the soluble 8 M fraction when incubated at 4°C for 1 h (lanes 3 and 4, Fig. 3). We performed a second extraction of the inclusion body at 30°C for 1 h, and this yielded a much more efficient extraction. The 8 M extraction was concentrated and the insoluble GOx was refolded in the presence of FAD for a week at 10°C. We refolded both extraction samples from
Figure 3. GOx solubility during the extraction process. The GOx band was confidently assigned by size and migration, previously verified by Coomassie staining and western blotting.
the 4°C and 30°C extractions. Refolded protein was concentrated and visualized by SDS-PAGE, Coomassie staining, and western blotting against the 6x-His tag of the protein (Fig. 4). We see a successful shift of GOx from the insoluble to the soluble fraction. To confirm the presence of GOx, and not another protein of the same band size, we performed a western blot. The blot showed little to no GOx in the 8 M extraction (4°C, #1) and only GOx in the soluble fraction (30°C, #2).
Figure 4. GOx refolding analysis. Refolded protein solutions were subjected to SDS-PAGE electrophoresis and stained with Coomassie, as well as western blotted with anti-His antibodies.
To test the activity of our refolded protein we utilized the Molecular Probes Amplex Red Glucose/Glucose Oxidase Assay Kit (A22189), which allows one-step detection of glucose oxidase activity by coupling the production of H2O2 to the activity of horseradish peroxidase (HRP). H2O2 reacts with Amplex Red (a colorless molecule), in the presence of HRP, to yield resorufin (a red-fluorescent product). This colorimetric assay thus directly couples the activity of glucose oxidase to the production of resorufin. We assayed our refolded solution of wild type GOx. In Fig. 5, we
can see that GOx-Wt produces resorufin in the presence of glucose (as evidenced by the solution turning pink), whereas the control stayed colorless. This provides strong evidence that active, functional GOx-Wt is present in the solution.
Figure 5. Amplex red assay for GOx-Wt refolding. Negative is the reaction assay with no glucose added.
Discussion

Unfortunately, the expression and isolation of GOx encountered many hurdles due to technical errors and unexpected results, which diverged from the results Holland et al. (2012) reported. However, their expression was performed in yeast, a eukaryotic system like A. niger itself. Taking this into account, it is not surprising that the protein was unstable when overexpressed in a prokaryotic system. The work of Witt et al. (1998) highlighted similar problems with GOx expression in E. coli, requiring the insoluble protein to be refolded from a denaturing solution. We tried several expression temperatures in E. coli, hoping that a reduced expression rate and time might help stabilize the protein, but were unsuccessful. In the end, we managed to extract the protein from the insoluble fraction and refold GOx into an active enzyme. While we have successfully refolded some portion of the protein, we concede that the protein is not in a pure state and still needs further isolation before it can be characterized in greater detail. At this point, the refolded samples are quite contaminated with other proteins that were also extracted in the urea buffer; urea is a general solubilizing agent and cannot be expected to extract only GOx. Our next steps will be to pass our working solution over a nickel affinity column, and other chromatography purification columns as necessary, to obtain pure protein. We cannot fully characterize the enzyme, its kinetics, and its effectiveness at an anode until we have a pure sample. While most of our effort was placed on expressing and isolating the wild type version of the protein, we now have an established protocol to move forward with the other variants. These protein expressions have already been performed at large scale (4 L each) and are ready to be isolated. Alternatively, we are building an expression vector for GOx production in yeast.
Previous research has shown that GOx can be obtained in high yield, from the soluble fraction, when expressed in different species of yeast (Holland et al., 2012).
Once characterized, the enzyme will be delivered to Dr. Santulli's group, which is synthesizing the gold nanowire anode and will test the enzyme's affinity and activity at that anode.
Acknowledgments

This work was supported by the Michael J. ’58 and Aimee Rusinko Kakos endowed chair in science. The author would like to express her gratitude to Dr. Bryan Wilkins for mentoring her throughout this research. Thanks go to the Department of Chemistry and Biochemistry for resources and equipment, and to Dr. Rani Roy for her continued effort and support, especially with summer housing arrangements during the research stay.
References

Azocleantech.com: Recycling Batteries and The Toxic Hazards of Battery Disposal. (2008). https://www.azocleantech.com/article.aspx?ArticleID=132

Cosnier, S., Gross, A. J., Goff, A. L., and Holzinger, M. (2016). Recent advances on enzymatic glucose/oxygen and hydrogen/oxygen biofuel cells: Achievements and limitations. Journal of Power Sources, 325, 252-263. doi:10.1016/j.jpowsour.2016.05.133

Energizer.com: Energizer Product Safety Data Sheet. (2017, January). Retrieved February 1, 2018, from http://data.energizer.com/pdfs/carbonzinc psds.pdf

Holland, J. T., Harper, J. C., Dolan, P. L., Manginell, M. M., Arango, D. C., et al. (2012). Rational redesign of glucose oxidase for improved catalytic function and stability. PLoS ONE 7(6): e37924. doi:10.1371/journal.pone.0037924

Witt, S., Singh, M., and Kalisz, H. M. (1998, April). Structural and kinetic properties of nonglycosylated recombinant Penicillium amagasakiense glucose oxidase expressed in Escherichia coli. Applied and Environmental Microbiology, 64(4).

Wohlfahrt, G., Witt, S., Hendle, J., and Schomburg, D. (1999). 1.8 and 1.9 Å resolution structures of the Penicillium amagasakiense and Aspergillus niger glucose oxidases as a basis for modelling substrate complexes. Acta Crystallographica Section D: Biological Crystallography.

Zecca, A., and Chiari, L. (2010). Fossil-fuel constraints on global warming. Energy Policy, 38(1), 1-3. doi:10.1016/j.enpol.2009.06.068

Zhu, Z., Tam, T. K., Sun, F., You, C., and Zhang, Y.-H. (2014, January 21). A high-energy-density sugar biobattery based on a synthetic enzymatic pathway. Retrieved February 1, 2018, from https://www.nature.com/articles/ncomms4026
Xylem conductivity of primary, secondary, and tertiary veins of plant leaves

Maya Carvalho-Evans∗

Laboratory of Plant Morphogenesis, Department of Biology, Manhattan College

Abstract. The purpose of this study was to determine relationships between xylem conductivity in leaf veins and the leaf areas they supply. Water enters leaves through xylem cells in the petiole and moves from primary veins to secondary veins to tertiary veins to feed leaf areas. Leaves of 23 species of broad-leaved plants with percurrent leaf venation were tested. Xylem conductivities ranged from 0.0533 to 12.9 g·cm·MPa−1·s−1 for primary veins, 0.00367 to 0.48 for secondary veins, and 0.0000297 to 0.055 for tertiary veins. Leaf areas of the 23 species were 7.25 to 311 cm2 for primary areas, 0.262 to 13.9 cm2 for secondary areas, and 0.0395 to 2.89 cm2 for tertiary areas. When xylem conductivities were plotted as a function of leaf areas, the slopes for primary, secondary, and tertiary veins were 0.044, 0.031, and 0.0085, respectively, with r2 values between 0.68 and 0.83. Xylem conductivities were well scaled to corresponding leaf areas, indicating that leaves are well supplied with water.
Introduction

Leaves are fundamental to plant function, providing carbohydrates through photosynthesis to the remainder of the plant. Stomata control the uptake of carbon dioxide and the loss of water via transpiration. When stomata are open, leaves lose water; thus, leaves are continuously losing water. As a result, leaves must obtain water from stems and transport that water effectively to all parts of a leaf (Fig. 1). The structure, arrangement, and function of veins may determine the distribution and productivity of water in the leaf. Among plant species there are diverse vein patterns. For dicotyledonous plants, there are two predominant venation patterns, reticulate and percurrent. Reticulate venation has many interconnected veins, with little regularity of pattern (Sack and Scoffoni, 2013). In percurrent leaves, primary veins move water to secondary veins and, in turn, secondary veins move water to tertiary veins (Fig. 2). Tertiary veins extend from one secondary vein to a second secondary vein to form distinct secondary and tertiary areas (Fig. 3). Primary, secondary, tertiary, and quaternary veins distribute water to leaf areas. Within veins, xylem cells are responsible for water transport (Fig. 4). Xylem vessels are located within veins, from roots to leaves. Each vein has an area to which it supplies water. Water enters the leaf's primary vein from the petiole at the base of the leaf. The xylem cells of the primary vein feed the entire leaf area by distributing the water to the secondary veins (Fig. 5) and then to the tertiary veins (Fig. 6). Vascular plants have veins consisting of vascular bundles that transport nutrients and water from the roots to the entire plant. Mature xylem vessels are columns of dead cells with thick, strong cellulose cell walls and a hollow lumen

∗ Research mentored by Lance Evans, Ph.D.
Carvalho-Evans
(Fig. 7). Water flows radially from vessel to vessel through perforations; this flow is driven by transpiration.
Figure 1. Image of a leaf (Magnolia x soulangeana Soul.-Boud) showing the path of water flow from the primary vein to a secondary vein to a tertiary vein.
Figure 2. Image of a leaf of Magnolia x soulangeana Soul.-Boud illustrating primary (blue), secondary (orange), and tertiary (yellow) veins. Note that secondary veins always originate from the primary vein, while tertiary veins always originate from secondary veins.
Figure 4. Image of a leaf cross section (Cornus kousa Bürger ex Miq.) showing a primary vein and the lamella on each side of the primary vein. Blue arrows represent the movement of water through xylem vessels. Circled in yellow is a vascular bundle with a total of 146 xylem vessels.
Figure 3. Image of a leaf of Magnolia x soulangeana Soul.-Boud illustrating the primary area (surrounded by blue line), secondary areas (surrounded by orange lines), and tertiary areas (surrounded by yellow lines).
Figure 5. Image of a leaf cross section (Hydrangea arborescens L.) showing a secondary vein and the lamella on each side of the secondary vein. The vascular bundle has a total of 18 xylem vessels.
Figure 6. Image of a leaf cross section (Hydrangea arborescens L.) showing a tertiary vein and the lamella on each side of the tertiary vein. The vascular bundle has a total of 9 xylem vessels.
Figure 7. Image of a cross section of part of a vascular bundle of a leaf of Hydrangea arborescens L. showing xylem cells (X). Xylem cells (vessel cells) are rounded with thick cell walls. Additional xylem cells that are not vessel cells surround the vessel cells but do not conduct water.
Larger areas need higher xylem vessel conductivity to distribute water; if a vein cannot adequately supply water to its area, that area will die. Primary veins are larger and have more xylem vessels and higher xylem conductivity, to provide water to large areas. Secondary veins have fewer xylem vessels than primary veins. Tertiary veins are smaller still, with fewer xylem vessels, needing to provide water only to very small areas. Therefore, this study tests whether larger leaf areas have larger xylem conductivities (a direct correlation) and whether vein conductivity increases as leaves enlarge.
Materials and Methods

Tree species sampled

Twenty-three (23) species of plants with percurrent leaves were sampled (Table 1). All samples were obtained from the Manhattan College campus and Van Cortlandt Park. Identification of species was done using Kershner et al. (2008), www.tropicos.org, and dendro.cnre.vt.edu.
Table 1. Leaf areas (in cm2) for twenty-three percurrent leaves

Species                                          Entire leaves   Secondary areas   Tertiary areas
Amelanchier arborea (F. Michx.) Fernald               66.7            2.35             0.335
Asclepias syriaca                                      235            6.61             0.417
Betula alleghaniensis Britton                         67.4            1.53             0.135
Betula papyrifera                                     44.9            2.14             0.224
Carpinus caroliniana Walter                           43.5            2.304            0.189
Carya tomentosa                                        113            3.37             0.503
Catalpa speciosa (Warder) Engelm                       311           13.9              2.770
Cornus kousa Bürger ex Miq.                             29            2.62             0.195
Euphorbia pulcherrima Wild. Ex Klotzsch               26.4            1.3              0.189
Hibiscus rosa-sinensis L.                             82.6            6.29             0.767
Hydrangea arborescens L.                              60.8            3.35             0.255
Lantana camara L.                                     16.4            1.025            0.119
Liriodendron tulipifera L.                            59.8            4.48             0.433
Magnolia x soulangeana Soul.-Boud                    125.9            5.27             1.162
Malus pumila Mill.                                    58.7            4.12             0.273
Morus rubra Lour.                                     80.5            2.88             0.247
Ostrya virginiana Britton, Sterns & Poggenb.          26.4            1.37             0.144
Phytolacca americana                                   197            9.2              2.900
Salix nigra Marshall                                  83.4            2.11             0.243
Tilia americana L.                                    66.8            2.81             0.234
Tilia platyphyllos Scop.                              74.2            3.62             0.225
Ulmus pumila L.                                       7.25            0.262            0.0395
Viburnum lentago L.                                  120.4            5.79             0.368
Mean                                                  86.8            3.86             0.537
Standard deviation                                    72.9            3.04             0.762
Leaf area measurements

Photographic images of each leaf were obtained with a ruler for scale. Images were uploaded to Microsoft Paint (www.microsoft.com) to trace leaf areas. Primary leaf areas consisted of the entire leaf, while secondary leaf areas (traced) consisted of the area bounded by lines bisecting the regions between an individual secondary vein and the secondary veins on each side of it. Tertiary areas were traced using the same method as secondary areas (Fig. 3). All images were downloaded to ImageJ (National Institutes of Health: https://imagej.nih.gov/ij) and primary, secondary, and tertiary areas were measured. The mean numbers of secondary and tertiary areas measured were 8 and 10, respectively.

Tissue sampling

Cross sections of primary, secondary, and tertiary veins were excised and fixed in FAA (Jensen, 1962). Tissues were dehydrated through a tertiary butanol series and embedded in Paraplast wax. Tissues were cut with a rotary microtome at 35 µm and placed on microscope slides. Tissues were stained with safranin and made permanent with Canada balsam.
Xylem cell measurements

From the processed veins, the total number of xylem cells (vessel cells) was determined. Vessels were considered to be circular with thick cell walls. No xylem fibers were counted in this study; only vessel cells were counted. Photographic images of tissues were downloaded to ImageJ to determine diameters of xylem vessels. For primary, secondary, and tertiary veins, 13, 10, and 7 xylem vessels were evaluated, respectively. Xylem conductivity measurements were made in accordance with McCulloh et al. (2009). Since vessels are rarely perfectly circular (Tyree and Zimmermann, 2002), the derived conductivities are only approximations. Two diameters, measured at right angles, were obtained for each vessel (Fig. 7). Mean diameter values were converted to mean radii. Weighted mean radii were used for tissue xylem conductivities (McCulloh et al., 2009). Xylem conductivity for each vein was calculated (in units of g·cm·MPa−1·s−1) using the Hagen-Poiseuille equation:

conductivity = [π × (number of conduits) × (mean conduit radius, in cm)4] / [8 × (viscosity of water)]

Fully-enlarged leaf experiment

For the hypothesis that larger leaves have larger xylem conductivities, 23 percurrent leaf species were studied. For each species, two trees were sampled and one leaf from each tree was processed. Each selected leaf was considered to be fully enlarged for the species, and no blemishes were present on any leaf.

Leaf enlargement experiment

To determine changes in leaf areas and xylem conductivity as leaves enlarge, only leaves of viburnum (Viburnum lentago L.) were analyzed. Six leaf sizes were analyzed; for each size, three individual leaves were sampled.
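In code form, the Hagen-Poiseuille calculation can be sketched as follows. This is an illustrative implementation, not the authors' script; the viscosity value and the example vessel count and radius are assumptions, and the density of water (1 g·cm−3) is taken as implicit so the result carries the paper's units of g·cm·MPa−1·s−1.

```python
import math

# Viscosity of water near 20 °C: ~1.002e-3 Pa·s = 1.002e-9 MPa·s (assumed value).
WATER_VISCOSITY_MPA_S = 1.002e-9

def xylem_conductivity(n_conduits, mean_radius_cm):
    """Hagen-Poiseuille vein conductivity: pi * N * r^4 / (8 * eta).

    With r in cm and eta in MPa·s (water density of 1 g/cm^3 implicit),
    the result is in g·cm·MPa^-1·s^-1.
    """
    return math.pi * n_conduits * mean_radius_cm ** 4 / (8 * WATER_VISCOSITY_MPA_S)

# Illustrative (not measured) values: a vein of 146 vessels, mean radius 20 um.
k = xylem_conductivity(146, 20e-4)  # 20 um = 20e-4 cm
print(k)
```

Because conductivity scales with the fourth power of the radius, a small difference in mean vessel diameter produces a large difference in conductivity, which is why primary veins so strongly outconduct tertiary veins.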
Results

Areas of entire leaves, secondary leaf areas, and tertiary leaf areas for the twenty-three species of this study are shown in Table 1. Conductivities of primary, secondary, and tertiary veins for the same species are shown in Table 2.

Fully-enlarged leaf experiment: larger leaf areas will have larger xylem conductivities

Primary veins have a high number of xylem cells. On average, the ratio of xylem cell numbers in primary veins to those in secondary veins was about 4.13:1, and the ratio for secondary to tertiary veins was about 3.78:1. The number of xylem cells ranged from 35 to 355 in primary veins, 11 to 108.5 in secondary veins, and 3 to 13 in tertiary veins. Average xylem vessel diameters in primary veins ranged from 24 to 68 µm. Average xylem
Table 2. Xylem conductivities (g·cm·MPa−1·s−1) for twenty-three percurrent leaves

Species                                          Primary veins   Secondary veins   Tertiary veins
Amelanchier arborea (F. Michx.) Fernald               0.302          0.0851           0.000588
Asclepias syriaca                                     12.2           0.311            0.000494
Betula alleghaniensis Britton                         1.22           0.00876          0.000234
Betula papyrifera                                     0.553          0.0116           0.000466
Carpinus caroliniana Walter                           0.476          0.00801          0.000195
Carya tomentosa                                       2.68           0.0452           0.000132
Catalpa speciosa (Warder) Engelm                      12.9           0.48             0.0255
Cornus kousa Bürger ex Miq.                           0.134          0.0164           0.00017
Euphorbia pulcherrima Wild. Ex Klotzsch               0.0605         0.00828          0.000839
Hibiscus rosa-sinensis L.                             0.339          0.0106           0.000996
Hydrangea arborescens L.                              0.467          0.00853          0.00079
Lantana camara L.                                     0.153          0.0221           0.00159
Liriodendron tulipifera L.                            1.123          0.155            0.0063
Magnolia x soulangeana Soul.-Boud                     1.24           0.0333           0.0015
Malus pumila Mill.                                    0.222          0.00823          0.000127
Morus rubra Lour.                                     0.647          0.12             0.000156
Ostrya virginiana Britton, Sterns & Poggenb.          0.973          0.0264           0.000157
Phytolacca americana                                  5.08           0.165            0.00238
Salix nigra Marshall                                  0.0533         0.00963          0.000309
Tilia americana L.                                    0.777          0.0594           0.000219
Tilia platyphyllos Scop.                              0.123          0.00367          0.0000297
Ulmus pumila L.                                       0.178          0.00598          0.0000513
Viburnum lentago L.                                   0.96           0.199            0.00071
Mean                                                  1.86           0.0783           0.00191
Standard deviation                                    3.55           0.118            0.00531
vessel diameters in secondary veins ranged from 18 to 43 µm, and in tertiary veins from 8 to 28 µm. Primary vein conductivities ranged from 0.0533 to 12.9 with a mean of 1.86 g·cm·MPa−1·s−1. Secondary vein conductivities ranged from 0.00367 to 0.48 with a mean of 0.0783 g·cm·MPa−1·s−1. Tertiary vein conductivities ranged from 0.0000297 to 0.055 with a mean of 0.00191 g·cm·MPa−1·s−1 (Table 2). Therefore, on average, secondary conductivities were 4% of primary conductivities and tertiary conductivities were 7% of secondary conductivities. Leaf areas of the 23 species ranged from 7.25 to 311 cm2 with a mean of 86.8 cm2. Secondary areas ranged from 0.262 to 13.9 cm2 with a mean of 3.86 cm2. Tertiary areas ranged from 0.0395 to 2.89 cm2 with a mean of 0.537 cm2 (Table 1). Therefore, on average, secondary areas were 4% of primary areas and tertiary areas were 14% of secondary areas. Average xylem conductivities of primary veins were positively correlated with entire leaf areas for the 23 species (Fig. 8). Average xylem conductivities of secondary veins were positively
correlated with secondary leaf areas for the 23 species (Fig. 9). Average xylem conductivities of tertiary veins were positively correlated with tertiary leaf areas for the 23 species (Fig. 10). For all three comparisons, r2 ranged from 0.68 to 0.83. The ratio of the secondary vein/area slope to the primary vein/area slope was 0.70. The ratio of the tertiary vein/area slope to the secondary vein/area slope was 0.27.

Figure 8. Relationship between average xylem conductivity of the primary vein and total leaf area for twenty-three plant species (y = 0.044x − 1.98; r2 = 0.83).

Figure 9. Relationship between average xylem conductivity of the secondary veins and secondary leaf areas for twenty-three plant species (y = 0.031x − 0.046; r2 = 0.68).

Figure 10. Relationship between average xylem conductivity of the tertiary veins and tertiary leaf areas for twenty-three plant species (y = 0.0085x − 0.0018; r2 = 0.82).

Figure 11. Relationship between average xylem conductivity of the primary veins and primary leaf area for viburnum (Viburnum lentago L.) (y = 0.0009x − 0.0107; r2 = 0.83).
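The straight-line fits and r2 values reported in Figs. 8-13 can be reproduced with ordinary least squares. The sketch below is an illustrative re-implementation, not the authors' original analysis, and it is demonstrated on made-up values rather than the measured data.

```python
def fit_line(areas, conductivities):
    """Least-squares fit k = slope * A + intercept, plus the r^2 of the fit."""
    n = len(areas)
    mean_a = sum(areas) / n
    mean_k = sum(conductivities) / n
    sxx = sum((a - mean_a) ** 2 for a in areas)
    sxy = sum((a - mean_a) * (k - mean_k) for a, k in zip(areas, conductivities))
    slope = sxy / sxx
    intercept = mean_k - slope * mean_a
    ss_res = sum((k - (slope * a + intercept)) ** 2
                 for a, k in zip(areas, conductivities))
    ss_tot = sum((k - mean_k) ** 2 for k in conductivities)
    return slope, intercept, 1 - ss_res / ss_tot

# Made-up example: conductivities lying exactly on the line k = 0.05*A - 1
slope, intercept, r2 = fit_line([20, 40, 60, 80], [0.0, 1.0, 2.0, 3.0])
print(slope, intercept, r2)
```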
Leaf enlargement experiment: as leaves enlarge, there is more conductivity in veins

The expectation for the leaf enlargement experiment was that as leaf areas enlarge, the xylem conductivities should increase proportionally. For leaves of Viburnum lentago, entire leaf areas enlarged from 8.38 to 123.9 cm2. Conductivities
of primary veins increased from 0.00289 to 0.145 g·cm·MPa−1·s−1 (Fig. 11). Secondary areas enlarged from 0.37 to 7.67 cm2. Conductivities of secondary veins increased from 0.000544 to 0.0189 g·cm·MPa−1·s−1 (Fig. 12). Tertiary areas enlarged from 0.0365 to 0.477 cm2. Conductivities of tertiary veins increased from 0.000000932 to 0.0000919 g·cm·MPa−1·s−1 (Fig. 13). r2 values for primary, secondary, and tertiary veins were 0.83, 0.88, and 0.31, respectively.

Figure 12. Relationship between average xylem conductivity of the secondary veins and secondary leaf areas for viburnum (Viburnum lentago L.) (y = 0.00023x − 0.0005; r2 = 0.88).

Figure 13. Relationship between average xylem conductivity of the tertiary veins and tertiary leaf area for viburnum (Viburnum lentago L.) (y = 0.00008x + 0.00003; r2 = 0.31).
Discussion

Xylem cells transport water throughout plants. Primary, secondary, and tertiary veins must supply water to the entire leaf. Water distribution occurs as primary veins move water to secondary veins and, in turn, secondary veins move water to tertiary veins. Individual tertiary areas receive water from two secondary veins (S) and two tertiary veins (T) (Fig. 14). Within each tertiary area, quaternary veins (Fig. 15) distribute water. All quaternary veins originate from secondary and tertiary veins. Xylem vessels within quaternary veins supply water throughout tertiary areas. There are many quaternary veins per tertiary area, and the secondary and tertiary veins must adequately supply water to all of them. The ratio of conductivity to area is much larger for the tertiary veins because they must supply all of the quaternary veins. Since most leaf cells survive for many weeks, water movement throughout leaves must be very efficient. If each primary vein provides water to 19 secondary veins (the average number of secondary veins per primary vein), each with an average conductivity of 0.0783, then the total of secondary conductivities per primary vein is 1.48 g·cm·MPa−1·s−1. This is similar to the primary conductivity of 1.86 g·cm·MPa−1·s−1. If each secondary vein provides water to 25 tertiary veins with an average conductivity of 0.00191, then we get a total conductivity for tertiary veins of 0.04775 g·cm·MPa−1·s−1. If there are 30 quaternaries per secondary vein with an average conductivity of 0.001, that adds another 0.030 g·cm·MPa−1·s−1 of conductivity per secondary vein. The sum of tertiary
Figure 14. Leaf areas enclosed by black lines receive water from two secondary veins (S) and two tertiary veins (T). Only the segments of those veins enclosed by the black lines are involved with the corresponding leaf area.
Figure 15. Quaternary veins traced by black lines help distribute water to a tertiary area. Quaternary veins are connected to the surrounding secondary and tertiary veins.
and quaternary conductivities per secondary vein is 0.0778 g·cm·MPa−1·s−1, which is very similar to the average conductivity for a secondary vein, 0.0783 g·cm·MPa−1·s−1 (Table 3).

Table 3. Average data for twenty-three percurrent leaves

Areas (cm2)
  Entire leaves   Secondary areas   Tertiary areas   Ratio Entire/Sec   Ratio Sec/Ter
  86.8            3.86              0.537            22.5               7.19

Vein lengths (mm)
  Primary   Secondary   Tertiary   Ratio Pri/Sec   Ratio Sec/Ter
  136       747         2040       0.182           0.366

Xylem conductivity (g·cm·MPa−1·s−1)
  Primary   Secondary   Tertiary   Ratio Pri/Sec   Ratio Sec/Ter
  1.86      0.0783      0.00191    23.7            41

Vein length (mm) × xylem conductivity (g·cm·MPa−1·s−1)
  Primary   Secondary   Tertiary   Ratio Pri/Sec   Ratio Sec/Ter
  253       58.5        3.9        4.3             15
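The conductivity budget described in the Discussion can be checked arithmetically. The sketch below simply re-multiplies the paper's reported averages; the quaternary conductivity of 0.001 is the assumed value used in the text, not a measured quantity.

```python
# Average vein counts and conductivities reported in the Discussion.
SECONDARIES_PER_PRIMARY = 19
TERTIARIES_PER_SECONDARY = 25
QUATERNARIES_PER_SECONDARY = 30
MEAN_K = {  # g·cm·MPa^-1·s^-1
    "primary": 1.86,
    "secondary": 0.0783,
    "tertiary": 0.00191,
    "quaternary": 0.001,  # assumed in the text, not measured
}

# Summed conductivity of all secondary veins fed by one primary vein.
total_secondary = SECONDARIES_PER_PRIMARY * MEAN_K["secondary"]
# Summed tertiary + quaternary conductivity fed by one secondary vein.
total_ter_quat = (TERTIARIES_PER_SECONDARY * MEAN_K["tertiary"]
                  + QUATERNARIES_PER_SECONDARY * MEAN_K["quaternary"])

print(f"secondaries vs. primary: {total_secondary:.3f} vs. {MEAN_K['primary']}")
print(f"ter+quat vs. secondary: {total_ter_quat:.4f} vs. {MEAN_K['secondary']}")
```

Both sums land close to the measured means (1.86 and 0.0783 g·cm·MPa−1·s−1), consistent with the budget argument in the text.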
Enlarging leaves of Viburnum lentago L., from 8.38 to 124 cm2, were analyzed. Conductivities of primary veins increased from 0.00289 to 0.145 g·cm·MPa−1·s−1. Secondary areas enlarged from 0.37 to 7.67 cm2, with conductivities increasing from 0.000544 to 0.0189 g·cm·MPa−1·s−1. Tertiary areas enlarged from 0.0365 to 0.477 cm2, with conductivities increasing from 0.000000932 to 0.0000919 g·cm·MPa−1·s−1. Xylem vessel conductivities versus areas for primary, secondary, and tertiary veins had r2 values of 0.83, 0.88, and 0.31, respectively. Additional species (Betula alleghaniensis Britton, Catalpa speciosa (Warder) Engelm, Hibiscus rosa-sinensis L., and Magnolia x soulangeana Soul.-Boud) will be added to the leaf enlargement experiment, and xylem vessel conductivity for primary, secondary, and tertiary veins will be compared to the enlarging areas of these species. For leaves of Viburnum lentago L., samples of the primary vein were taken in the middle and at the base, close to the petiole. Conductivities for the primary middle and primary base were compared to entire leaf area. Conductivities of the primary base vein ranged from 0.0269 to 0.181 g·cm·MPa−1·s−1; conductivities of the primary middle vein ranged from 0.00289 to 0.105 g·cm·MPa−1·s−1.
Acknowledgments
The author is indebted to the Catherine and Robert Fenton Endowed Chair in Biology, held by Dr. Lance S. Evans, for financial support for this research.
References
Jensen, W.A. (1962). Botanical Histochemistry: Principles and Practice. W.H. Freeman, San Francisco, CA.
Kershner, B., Tufts, C., Mathews, D., Nelson, G., Spellenberg, R., Thieret, J.W., Purinton, T., Block, A., and Moore, G. (2008). National Wildlife Federation Field Guide to Trees of North America.
McCulloh, K.A., Sperry, J.S., Meinzer, F.C., Lachenbruch, B., and Atala, C. (2009). Murray's law, the 'Yarrum' optimum, and the hydraulic architecture of compound leaves. New Phytologist 184: 234-244.
Sack, L., and Scoffoni, C. (2013). Leaf venation: structure, function, development, evolution, ecology and applications in the past, present and future. New Phytologist 198: 983-1000. doi:10.1111/nph.12253
Tyree, M.T., and Zimmermann, M.H. (2002). Xylem Structure and the Ascent of Sap. Springer Verlag, Berlin, Germany.
Geographic variation in body size and sexual size dimorphism of North American ratsnakes

Alexander J. Constantine∗

Department of Biology, Manhattan College

Abstract. Understanding patterns of geographical variation in body size among and within species has long been a goal of evolutionary ecology because body size affects nearly all life-history traits of an organism. Despite the identification of well-defined patterns accompanied by explanatory hypotheses, there is difficulty in determining if such a universal pattern and explanation exists. Significant research has shown that in endotherms (i.e., mammals and birds), organismal body size increases with latitude and decreases toward the equator. However, recent research has shown this pattern to be the opposite in ectotherms (i.e., reptiles). I explored the body size variation and sexual size dimorphism of North American ratsnakes from the archives of the American Museum of Natural History across their geographic range. There was no trend in ratsnake body size across the geographic range, but sexual size dimorphism favored males, which were generally larger than females. Since there was no evidence of a latitudinal trend in body size, more complicated factors may be driving the body size of ratsnakes across their geographic range.
Introduction
A major goal of organismal biology is to understand how the environment affects organisms, because body size affects nearly all life-history traits of an organism (Brown et al., 2004). Despite the identification of well-defined patterns accompanied by explanatory hypotheses, there is difficulty in determining if universal patterns and explanations exist for all organisms. One well-known rule of how organisms interact with their environment dates back to the 1800s, proposed by the renowned biologist Carl Bergmann. According to Bergmann's rule (1847), body size is directly proportional to latitude – as one increases, so does the other. This is true within many taxa (especially birds and mammals), where organism size increases at higher latitudes and decreases toward the equator (Bergmann, 1847). Bergmann hypothesized that this proportionality results from lower surface-to-volume ratios giving larger individuals an energetic advantage at higher latitudes: organisms are larger at higher, colder latitudes because a bigger body retains more heat. However, there are exceptions to this rule: different patterns may be observed in other taxa (i.e., amphibians, reptiles, and fishes), where body size does not proportionally increase with latitude (Ashton and Feldman, 2003; Olalla-Tarraga et al., 2006). For example, a recent study examined body sizes of ratsnakes, belonging to the class Reptilia (Degregorio et al., 2018). These snakes were found to be larger around core latitudes within their range, and smaller near latitudes on the peripheries of their range (both north and south). Reptiles might be smaller at higher latitudes since growing seasons there are shorter (Blanckenhorn et al., 2006), so they might spend
Research mentored by Gerardo Carfagno, Ph.D.
more time hibernating rather than being active and growing, but larger at the center of the range where conditions are more ideal. Additionally, animal behavior could play a role in size differences between the sexes. Male ratsnakes are known to engage in fights with one another over a female (Gillingham, 1980), and larger males seize the reproductive advantage in such combat; therefore, males tend to be larger than females (Blouin-Demers et al., 2005). It is important to understand these patterns of geographical variation in body size among ratsnakes, as they are an exception to the rule that animals increase in body size as latitude increases, and they could provide better comprehension of how animals other than mammals and birds vary geographically. My goal was to determine if the same patterns seen in the above referenced studies on wild-caught ratsnakes would also be observed in historic museum specimens of the same species. I first hypothesized that ratsnakes would be smaller at the peripheries of their geographic range and larger toward the center. My second hypothesis was that males would be larger in body size than females due to inter-male combat for reproductive females.
Material and Methods
This research was conducted with specimens from the archives of the Herpetology Department of the American Museum of Natural History in New York City. The ratsnakes studied were preserved in jars and large buckets, each filled with a solution of 70% ethanol. Some of the oldest snakes we measured were a little over one hundred years old, and all of the preserved specimens were coiled (in rigor mortis). The museum has an extensive collection of over 1,500 ratsnakes, but only 392 of them met my requirements of being located along the east coast and having a snout-vent length (SVL) of over 80 cm, as this is the smallest reported adult size (Ernst and Ernst, 2003). To explore variation in body size of ratsnakes across their geographic range, we used the standard snout-vent length (SVL) measurement in cm (Blouin-Demers et al., 2003), as well as tail length. SVL was used in case of a missing or damaged tail; however, the total length of each snake was still calculated by adding the tail length when available. SVL was measured with a length of twine placed along the dorsal side of the snake from its snout down to its vent. For sexual size dimorphism, we calculated the sexual size dimorphism index (SSDI) with the following standard formula (Lovich and Gibbons, 1992):
SSDI = (mean female size / mean male size) − 1.
If the resulting value was negative, that meant the males were larger than females within that population, and if the value was positive, then that meant the females were larger than males.
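As an illustration only (the snake measurements below are hypothetical, not the AMNH data), the index and its sign convention can be computed as:

```python
from statistics import mean

def ssdi(male_svls, female_svls):
    """Sexual size dimorphism index: (mean female size / mean male size) - 1.
    Negative -> males larger on average; positive -> females larger."""
    return mean(female_svls) / mean(male_svls) - 1

# Hypothetical SVL samples (cm) for one state.
males = [132.0, 128.5, 140.2]
females = [120.1, 118.7, 125.0]
print(round(ssdi(males, females), 2))  # -0.09: males larger in this sample
```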
We were also interested in whether other body measurements might vary in a standard pattern across the geographic range. Therefore, we also measured average head width and head length. We
chose these variables because we assumed that head dimensions might be correlated with what a snake can eat; therefore, differences in head size might reflect differences in available food items across a geographic range. Electronic calipers were used to measure the length of the head from the snout to the jaw bone, and the width of the head was measured behind the eyes. We calculated means and standard errors for all of the measured data.
Results
A total of 392 specimens were measured – 208 males and 184 females – from 11 different states trending from north to south along the east coast of the US. Regarding geographic variation in body size, SVL measurements varied along the north-to-south east coast range (Fig. 1), with no clear latitudinal trend in size. However, some central states such as Tennessee, North Carolina, and South Carolina had smaller snakes than states at higher latitudes such as Rhode Island and New York (Fig. 1). The most northern state, Rhode Island, averaged the longest SVL (135.54 cm), while the most central state, South Carolina, averaged the shortest (110.85 cm). This is opposite to the initial prediction that central states would yield larger snakes than the peripheries.
Figure 1. Average snout vent length (SVL) of all ratsnakes per state. The x-axis shows states trending down the north to south range; the y-axis shows the average SVL measured in cm.
Regarding examples of body size dimorphism between males and females, tail length measurements showed that, in all states, males generally had longer tails than females (Fig. 2). Male snakes averaged a tail length of about 26 cm, compared to a female average of 23.76 cm (Fig. 2). This is not surprising, as female snake tails consist largely of muscle, while males store their hemipenes (genitalia) within their tails, which explains why male tails are longer (King, 1989).
Figure 2. Male versus female average tail length. The x-axis shows gender; the y-axis shows the average tail length measured in cm.
The second example of body size dimorphism was found using the SSDI, which showed that male body size was larger than female body size in most states (Fig. 3). The horizontal line in the graph marks zero; anything below it is negative, meaning males are larger than females in that state. This was generally the case, with the exceptions of North Carolina and Louisiana, which displayed positive values, meaning females there were larger than males (Fig. 3). These results generally supported my hypothesis that male body size would exceed female body size due to intermale combat, which favors larger males.
Figure 3. Sexual size dimorphism index (SSDI) per state. The x-axis shows states trending down the north to south range; the y-axis shows the SSDI.
Measurements for head width and head length varied along the geographic range. The two
head measurements had correlating data (i.e., snakes with longer heads had wider heads), so only the head width data are shown. This variable is likely the more important one, as head width usually correlates with the prey species snakes can feed upon (Madsen, 2011). Snakes from North Carolina, Alabama, and Texas all had smaller head sizes (1.65-1.8 cm wide) compared to Rhode Island, South Carolina, and Louisiana (1.95-2.03 cm wide) (Fig. 4). South Carolina is particularly interesting, in that this state had the snakes with the smallest bodies but the largest heads, indicating head sizes are not simply a by-product of overall body size. Smaller head widths might reflect a diet of smaller prey such as lizards, whereas snakes with larger heads may feed on larger prey such as birds, small rodents, or other mammals (Madsen, 2011).
Figure 4. The average head width of males and females per state. The x-axis shows states trending down the north to south range; the y-axis shows the average head width measured in cm.
Discussion
I found that the body size of ratsnakes did not vary latitudinally but did vary geographically (Fig. 1). Despite reports of a latitudinal trend among reptiles, whereby snakes should be smaller toward the peripheries (north and south) and larger toward the central latitudes of their geographic range (Ashton and Feldman, 2003), where conditions for snake growth are more ideal (Degregorio et al., 2018), my research instead showed a mix of large and small snake populations in northern, central, and southern states alike. This observation of different states having smaller or larger snakes might reflect differences in prey availability, where diet may outrank latitudinal effects on snake size. However, more research along these lines would be needed to confirm this assumption. SSDI measurements showed that most states had larger males than females, although the data varied. SSDI values in Alabama drastically favored males (-0.35) compared to other states such as Rhode Island in the north (-0.10), South Carolina at the center (-0.16), and Florida in the south of the range (-0.06). Generally speaking, the difference between male and female body size was what I expected. The results from both examples of dimorphism between males and females (tail length
and SSDI) supported my hypothesis that males should be larger than females because ratsnakes engage in intermale combat (Gillingham, 1980). In terms of SSDI, most states showed negative values, indicating larger males than females (Fig. 3). However, some states were exceptions to my hypothesis. Females in North Carolina and Louisiana were relatively larger than in other states. This could be because larger female body sizes allow for greater fecundity in egg production (Ford and Seigel, 1989). Perhaps in these states there is a greater likelihood of egg predation, thus selecting for larger female sizes. Again, further study would be needed to confirm this assumption. Our head width and head length data could help to explain why there is no regular latitudinal trend in snake size. Head sizes usually correlate with the species a snake can feed upon. Thus, the snakes with smaller head widths in North Carolina, Alabama, and Texas most likely feed on smaller prey like small lizards, whereas in states where head width is larger, the prey species are likely to be larger (Madsen, 2011). Since snakes are ectotherms (they rely on the environment to control body temperature), they convert food into energy that is used largely for reproduction and growth (Flouris et al., 2014). Therefore, it is possible that the larger the prey species, the more energy snakes can put toward growth. With this information, a new hypothesis could be stated: "a ratsnake's available diet drives body size across a geographic range, and not necessarily latitude." However, better information on what snakes are feeding on would give better insight into what drives size differences across the ratsnake geographic range. My data may not have yielded a latitudinal trend in ratsnake populations due to the nature of my research methods.
The studies that I referenced all included wild, live specimens, whereas I used preserved specimens from the archives of the AMNH. My data may have been compromised by small sample sizes limited to the museum's archival collection. For instance, from Rhode Island I measured 10 snakes in total, but 54 from Florida. Data may have been limited from certain states because specimens may be collected preferentially from areas in which curators work. Museum specimen preference was another variable that may have affected my outcomes. Museums may tend to select and preserve the largest specimens captured from the wild. This could be a reason why some northern states ended up having the largest snakes: small sample sizes combined with measurements of the largest snakes from a state may have overestimated average ratsnake sizes from those locations. Museums also tend to have specimens that vary in age. For example, I measured ratsnakes captured and preserved in 1902, but also snakes from the 1970s. This age gap can skew data since the snakes are not contemporaneous; ratsnakes in 1902 may have been larger or smaller than ratsnakes in the 1970s. This is a weakness of collecting biological specimens from a museum archive. With these weaknesses in mind, we were still able to collect a large number of samples in a short period of time. This is the benefit of studying at a museum (Suarez and Tsutsui, 2004).
For future study, perhaps an ideal scenario would be a mix of field populations as well as museum specimens from a variety of locations. In addition, I focused on one species. It would be interesting to determine what patterns might exist in other species from all around the world, facing different environmental and evolutionary pressures, for example. This is one area where museum study would have an advantage, as it would be easier to sample many populations in one location rather than having to travel all around the world.
Acknowledgments This work was funded by the Linda and Dennis Fenton ’73 Endowed Biology Research Fund. The author thanks the American Museum of Natural History for logistical support, as well as specimen sampling. He especially thanks his advisor, Dr. Gerardo Carfagno, for his patience, knowledge, and guidance throughout this research.
References
Ashton, K.G., and Feldman, C.R. (2003). Bergmann's rule in nonavian reptiles: turtles follow it, lizards and snakes reverse it. Evolution 57: 1151-1163.
Bergmann, C. (1847). Über die Verhältnisse der Wärmeökonomie der Thiere zu ihrer Grösse. Göttinger Studien 3: 595-708.
Blanckenhorn, W.U., Dixon, A.F., Fairbairn, D.J., Foellmer, M.W., Gibert, P., Linde, K.V., Meier, R., Nylin, S., Pitnick, S., Schoff, C., and Signorelli, M. (2006). Proximate causes of Rensch's rule: does sexual size dimorphism in arthropods result from sex differences in development time? American Naturalist 169(2): 245-257. doi:10.1086/510597
Blouin-Demers, G., et al. (2005). Genetic evidence for sexual selection in black ratsnakes (Elaphe obsoleta). Animal Behaviour 69: 225-234. doi:10.1016/j.anbehav.2004.03.012
Brown, J.H., Gillooly, J.F., Allen, A.P., Savage, V.M., and West, G.B. (2004). Toward a metabolic theory of ecology. Ecology 85: 1771-1789. doi:10.1890/03-9000
DeGregorio, B., Blouin-Demers, G., Carfagno, G., Gibbons, J.W., Mullin, S., Sperry, J., Willson, J.D., Wray, K., and Weatherhead, P. (2018). Geographic variation in body size and sexual size dimorphism of North American ratsnakes (Pantherophis spp. sensu lato). Canadian Journal of Zoology, cjz-2018-0005.R2
Ernst, C.H., and Ernst, E.M. (2003). Snakes of the United States and Canada. Smithsonian Books, Washington, DC, USA.
Flouris, A.D., and Piantoni, C. (2014). Links between thermoregulation and aging in endotherms and ectotherms. Temperature 2(1): 73-85. doi:10.4161/23328940.2014.989793
Ford, N.B., and Seigel, R.A. (1989). Relationships among body size, clutch size, and egg size in three species of oviparous snakes. Herpetologica 45(1): 75-83.
Gillingham, J.C. (1980). Communication and combat behavior of the black rat snake (Elaphe obsoleta). Herpetologica 36(1): 120-127.
King, R.B. (1989). Sexual dimorphism in snake tail length: sexual selection, natural selection, or morphological constraint? Biological Journal of the Linnean Society 38(2): 133-154. doi:10.1111/j.1095-8312.1989.tb01570.x
Lovich, J.E., and Gibbons, J.W. (1992). A review of techniques for quantifying sexual size dimorphism. Growth, Development and Aging 56(4): 269-281.
Madsen, T. (2011). Body condition and head size in snakes. Amphibia-Reptilia 32(4): 565-567. doi:10.1163/156853811X610339
Olalla-Tarraga, M.A., Rodriguez, M.A., and Hawkins, B.A. (2006). Broad-scale patterns of body size in squamate reptiles of Europe and North America. Journal of Biogeography 33: 781-793. doi:10.1111/j.1365-2699.2006.01435.x
Suarez, A.V., and Tsutsui, N.D. (2004). The value of museum collections for research and society. BioScience 54(1): 66-74. doi:10.1641/0006-3568(2004)054[0066:TVOMCF]2.0.CO;2
Bark formation for Neobuxbaumia tetetzo and Pachycereus hollianus

Phillip Dombrovskiy∗

Laboratory of Plant Morphogenesis, Department of Biology, Manhattan College

Abstract. Previous studies have examined both the physical characteristics and bark formation processes of long-lived columnar cactus species between the latitudes of 32° N and 32° S. Observations have shown that bark formation on such columnar cacti involves rapid divisions of the epidermal cells. Previous studies report that at a latitude of 18° N, south-facing surfaces experience greater direct sun exposure than north-facing surfaces. An observational study was performed at the Tehuacán-Cuicatlán Biosphere Reserve in Tehuacán Valley, Mexico (18° N, 97° W), using field observation, photography, and light microscopy. The intention of this research was to (1) study the superficial anatomy of excised tissue samples from Neobuxbaumia tetetzo and Pachycereus hollianus, (2) establish the bark pattern for both species, and (3) determine whether south-facing surfaces of the cacti possess greater sunlight-induced bark than north-facing surfaces. Differences in morphology and surface tissue anatomy were noted between Neobuxbaumia tetetzo and Pachycereus hollianus. Evidence was acquired in support of south-facing cactus surfaces acquiring more bark than north-facing cactus surfaces.
Introduction
The formation of plant bark has been noted in cactus species such as Carnegiea gigantea and Echinopsis tarijensis, spanning between the 32° N and 32° S latitudinal lines (Evans et al., 1994; Evans and Cooney, 2015). Proliferation of damaged cactus surfaces (barked surfaces) has been shown to lead to premature cactus death (Evans et al., 1995; 2003). The current study examines the histological components of cactus surfaces and bark, particularly at 18° N, 97° W, home to the Tehuacán-Cuicatlán Biosphere Reserve in Tehuacán Valley, Mexico (Fig. 1). The Tehuacán Valley
Figure 1. Photographic image of several cactus species endemic to Tehuacán-Cuicatlán Biosphere Reserve, San Juan Raya (18◦ N, 97◦ W), Puebla, Mexico. ∗
Research mentored by Lance Evans, Ph.D.
has a rich diversity of endemic cactus species due to the rain shadow effect of the Sierra Madre Oriental mountains (Smith, 1965). The variety observed in the region is reflected by the characteristics of Neobuxbaumia tetetzo and Pachycereus hollianus (Figs. 2A and 2B), the species of interest (www.sanjuanraya.com). The purpose of this research was to understand bark coverage on surfaces of Neobuxbaumia tetetzo and Pachycereus hollianus, and to study the tissue changes that occur during bark formation.

Figure 2. A: Image of a young plant of Neobuxbaumia tetetzo. B: Image of a plant of Pachycereus hollianus.
The first symptom of bark on cactus surfaces is a buildup of epicuticular waxes that results in poor gas exchange through the stomata (Evans and Macri, 2008). A previous study concluded that enhanced levels of UV-B radiation increase the accumulation of such epicuticular waxes (Evans et al., 2001). Cellular divisions of the epidermal cells follow, resulting in darkly colored, elevated surfaces formally labeled as true bark (Evans et al., 1994). The amount of bark present on the surfaces of a cactus correlates with its age: while younger cacti have healthier surface tissues, older cacti have extensive bark (Evans et al., 1995; 2005). Death of the cactus follows the formation of bark, since few healthy tissues remain to perform metabolic processes such as photosynthesis and cellular respiration.
Figure 3. Diagram of the concept that bark occurs first on south-facing surfaces and progresses to the north-facing surfaces.
Studies completed in the past have concluded that bark injuries are related to direction: they appear first on south-facing surfaces before progressing to north-facing surfaces (Fig. 3; Evans et al., 1994). At the 18° N latitude, there is a 2:1 ratio of direct sunlight exposure between south-facing and north-facing surfaces (Geller and Nobel, 1984). A data set displaying the amount of direct sunlight hitting the four cardinal-facing directions (north, east, south, and west) of an upright pole, at all latitudes on Earth and at any date of the year, was also compiled (Geller and Nobel, 1984). The sunlight exposure data for the latitude of the biosphere reserve were averaged over a period of one year, spanning from January 1 to December 31, to provide the 2:1 ratio mentioned above. Using this calculation, the relationships between the different cactus surfaces were surveyed. It was hypothesized that the percent bark coverage data for Neobuxbaumia tetetzo and Pachycereus hollianus would show approximately a 2:1 ratio of bark between the south-facing and north-facing surfaces. It was also hypothesized that the anatomical characteristics of the tissue structures differ between Neobuxbaumia tetetzo and Pachycereus hollianus.
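The annual averaging behind the 2:1 figure can be sketched as follows; the monthly exposure values here are illustrative placeholders, not the Geller and Nobel (1984) data:

```python
# Hypothetical monthly direct-sunlight exposure (hours/day) on south- and
# north-facing surfaces at ~18° N; placeholder values for illustration only.
south = [9.5, 9.0, 8.0, 6.5, 5.5, 5.0, 5.2, 6.0, 7.5, 8.8, 9.4, 9.6]
north = [3.0, 3.2, 3.7, 4.1, 4.5, 4.7, 4.6, 4.3, 3.8, 3.3, 3.0, 2.8]

# Average each series over the full year (January 1 to December 31).
mean_south = sum(south) / len(south)
mean_north = sum(north) / len(north)
print(f"south:north exposure ratio = {mean_south / mean_north:.1f}:1")  # 2.0:1
```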
Methods and Materials
Site description
The current study took place at the Tehuacán-Cuicatlán Biosphere Reserve, San Juan Raya, Puebla, Mexico (http://www.sanjuanraya.com; 18° N, 97° W), where the Neobuxbaumia tetetzo and Pachycereus hollianus cactus species were examined. The biosphere reserve is a protected area meant to reduce human activity and ecosystem intrusion. The area is a semi-arid zone with the richest cactus biodiversity in North America (UNESCO), and possesses valleys with high columnar cactus densities (UNESCO). The main vegetation type is arid tropical scrub on rocky soils (Valiente-Banuet and Ezcurra, 1991). Both N. tetetzo and P. hollianus were spatially distributed in a clustered manner. Bark coverage data were obtained from 60 plants of Neobuxbaumia tetetzo and 77 plants of Pachycereus hollianus. Anderson (2001) describes the characteristics of both N. tetetzo and P. hollianus.
Bark surface coverage methods
Densely populated areas were not used for this study, since plants there are subject to shading by other plants, which limits sunlight exposure. Various attempts were made to sample a wide diversity of healthy and damaged cacti. Cacti were sampled from both flat and hill-side terrains, and all were examined at approximately 1.7 meters above ground. Cacti were examined for their crests (vertical protrusions) and the troughs (vertical indentations) between crests (Anderson, 2001; Gibson and Nobel, 1986). Bark percentage data and photographs were collected from the sampled cacti; 12 surfaces were examined per plant. These surfaces include the crests (ribs) of the cacti in the cardinal-facing directions (south, east, north, west), as well as the right trough and left trough
for each of the cardinal-facing crests (Fig. 4). Pictures of the right and left troughs were taken for each cardinal-facing direction with a point-and-shoot camera; the crest was visible in both types of images. Surface views of Neobuxbaumia tetetzo (Fig. 5) and Pachycereus hollianus (Fig. 6) display the extent of bark on cacti of the Class IV grouping.
Figure 4. Image of a surface of Neobuxbaumia tetetzo showing crests and troughs.

Figure 5. Images of crest and trough surfaces of Neobuxbaumia tetetzo. A: The south-facing surface with extensive bark on the crest with some bark on the trough. B: The east-facing surface with bark on the crest and trough. C: The north-facing surface with bark on the crest and trough. D: The west-facing surface with bark on the crest and trough.

Figure 6. Images of crest and trough surfaces of Pachycereus hollianus. A: The south-facing surface with extensive bark on the crest and the trough. B: The east-facing surface with bark on the crest and trough. C: The north-facing surface with bark on the crest and trough. D: The west-facing surface with bark on the crest and trough.
Bark data analyses
All sampled cacti were then grouped into classes according to the percentage of bark on the south-facing crest (south-facing rib): Class I, 0% to 24% bark; Class II, 25% to 49%; Class III, 50% to 74%; and Class IV, 75% to 100%. Microsoft Excel was used to organize the physical data sheets collected at the biosphere reserve. The percent bark data were converted to arcsine-transformed values, and t-test analyses (Snedecor and Cochran, 1967) were conducted for crest-to-crest, crest-to-trough, and trough-to-trough comparisons (Tables 1 and 2).

Table 1. Bark percentages on surfaces of Neobuxbaumia tetetzo. Upper case letters: statistical analyses of crest comparisons. Lower case letters: statistical analyses of crest-to-trough comparisons. Roman numerals: statistical analyses of trough-to-trough comparisons. If letters are different, bark percentages were different at p < 0.05.

                         South              East               North              West
Cactus class             Crest   R. trough  Crest   R. trough  Crest   R. trough  Crest   R. trough
I   (0-24% bark)         12A,a   9b,I       16      10III      14a     5b,II      11B     13III
II  (25-49% bark)        38a     14b        48a     23b,I      36a     11b,II     38a     13b,II
III (50-74% bark)        60A,a   68b,I      32B     28         28B     25II       31B     35
IV  (75-100% bark)       92A,a   63b        60B     51         55B     48         59B     49
Table 2. Bark percentages on surfaces of Pachycereus hollianus. Upper case letters: statistical analyses of crest comparisons. Lower case letters: statistical analyses of crest-to-trough comparisons. Roman numerals: statistical analyses of trough-to-trough comparisons. If letters are different, bark percentages were different at p < 0.05.

                         South              East               North              West
Cactus class             Crest   R. trough  Crest   R. trough  Crest   R. trough  Crest   R. trough
I   (0-24% bark)         8A      15I        20B     17I        15A     10II       9C      6III
II  (25-49% bark)        33      35         55      48         25      15         15      13
III (50-74% bark)        62A,a   37b,I      53      53         44      28II       27B     18
IV  (75-100% bark)       95A,a   82b,I      76B     64II       55C     43III      39C     34III
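The arcsine transform and t-test comparisons described above can be sketched as follows; the bark percentages are illustrative, not the field data, and the equal-variance t statistic is computed by hand with the standard library:

```python
import math
from statistics import mean, stdev

def arcsine_sqrt(percent):
    """Arcsine square-root transform commonly applied to percentage data."""
    return math.asin(math.sqrt(percent / 100.0))

# Illustrative per-plant bark percentages for two surfaces of one class.
south_crest = [88, 95, 90, 97, 92, 94]
north_crest = [50, 60, 52, 58, 55, 57]

s = [arcsine_sqrt(p) for p in south_crest]
n = [arcsine_sqrt(p) for p in north_crest]

# Pooled-variance two-sample t statistic on the transformed values.
n1, n2 = len(s), len(n)
sp2 = ((n1 - 1) * stdev(s) ** 2 + (n2 - 1) * stdev(n) ** 2) / (n1 + n2 - 2)
t = (mean(s) - mean(n)) / math.sqrt(sp2 * (1 / n1 + 1 / n2))
print(f"t = {t:.1f} on {n1 + n2 - 2} df")  # a large t here indicates p < 0.05
```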
Field methods / Histological procedures
Field pictures of each cactus were taken before excision of the sample, during excision, and of the sample alone. Tissue samples of roughly 2 cm × 1 cm × 1 cm were acquired from a trough of each species at 1.7 meters above ground, with two samples cut per category. The relatively flat troughs were used as sample tissues due to the crests of the cacti
possessing rounder surfaces, and the areoles preventing the excision of a sufficient sample. The samples were treated with a fixative solution of acetic acid and ethanol to kill the living plant cells and prevent decay of the tissues, and were stored in capped test tubes (Jensen, 1962).
Histological analysis
Tissue samples were processed through a dehydration procedure and embedded in wax (Paraplast X-tra Tissue Embedding Medium; McCormick Scientific, St. Louis, MO). Microtome sections were cut at 35 to 45 µm and placed on microscope slides. Tissues were deparaffinized, stained with 5% Safranin, and coverslipped with Canada balsam. At least 30 microscope slides were prepared for each species and examined under an optical microscope at 40× and 100× magnification. Four tissue classes were established to analyze changes in the bark formation process: Class A samples exhibited no visible bark; Class B samples exhibited initial bark layers; Class C samples exhibited extensive bark coverage (50% to 74%); and Class D samples exhibited complete bark coverage (75% to 100%). Photographs of tissues were scanned into computer files.
Results

Bark coverage

The bark percentage data collected for Neobuxbaumia tetetzo supported the initial hypothesis of a 2:1 ratio of bark coverage between south-facing and north-facing surfaces in the crest data for both Class III and Class IV (Table 1). T-test analyses between the crests of all classes showed that crest bark composition on south-facing surfaces differed from that on north-facing surfaces. For Class III of Neobuxbaumia tetetzo, the mean bark composition was 60% on the south-facing crest and 28% on the north-facing crest. For Class IV, the mean bark composition was 92% on the south-facing crest and 55% on the north-facing crest. The difference in mean bark composition between the south-right troughs and the north-right troughs for Class III was also statistically significant: the south-right troughs had a mean bark composition of 68% and the north-right troughs of 25%. Pachycereus hollianus presented Class IV data that supported the hypothesized 2:1 ratio of bark coverage between the south-facing and north-facing surfaces (Table 2). The difference in mean bark coverage between the south-facing crest and the north-facing crest was statistically significant: for Class IV, the south-facing crest possessed a mean of 95% bark, while the north-facing crest possessed a mean of 55% bark. The Class III data for Pachycereus hollianus showed a statistically significant difference in mean percent bark between the south-right troughs (37%) and the north-right troughs (28%).
Figure 7. Images of surface, excised stem samples, and histological preparations of Neobuxbaumia tetetzo. A-C: Samples of a relatively healthy surface. A: Surface of a trough with little bark. B: Excised stem sample showing extensive green chlorenchyma and relatively light-colored parenchyma (interior). C: Histological sample that shows a single-celled epidermis with relatively thick cuticle (e), a multicellular hypodermis (h), and chlorenchyma (c). D-F: Samples of a surface with initial bark. D: Surface of a trough with bark. E: Excised stem sample showing a green chlorenchyma, a darkened layer just beneath the chlorenchyma, and a more yellowish-colored parenchyma. F: Histological sample that shows a single-celled epidermis (a), a multicellular epidermis that resulted from periclinal cell divisions (b), and a shedding of some epidermal layers during the bark process (c). The hypodermal cells (H) seem to have lost their secondary cell walls and have become deformed. The chlorenchyma is less structured. G-I: Samples of a surface with extensive bark. G: Surface of a trough with bark. H: Stem sample showing extensive bark covering, a green chlorenchyma, and a darkened layer just beneath the chlorenchyma. I: Extensive bark covering. The hypodermis is almost indistinguishable and the chlorenchyma is disorganized. J-L: Samples of a surface with complete bark. J: Surface of a trough with bark. K: Sample with extensive bark covering. The chlorenchyma is not green. L: Extensive bark development with cellular layers. The hypodermis cannot be distinguished and the chlorenchyma is not organized.
Surface and histological analyses

Histological samples of Neobuxbaumia tetetzo of Class A had no visible bark (Figs. 7A, 7B). These samples had columnar, elongated epidermal cells (Fig. 7C). A single layer of cells was present for the epidermis, covered by a relatively thick cuticle layer visible under the microscope (Fig. 7C). Both the multicellular hypodermis and the chlorenchyma tissues were organized. Class B samples exhibited initial bark layers (Figs. 7D, 7E). Epidermal cells showed many cell layers and no cuticle layer (Fig. 7F). Moreover, the hypodermal layer was irregular, coincident with irregularly shaped chlorenchyma cells. The image in Fig. 7F shows initial bark layers forming. Samples of Class C had extensive bark coverage (Figs. 7G, 7H, 7I). The bark layers had accumulated dark stains. The hypodermal layer was completely disrupted and the chlorenchyma cells were disorganized (Fig. 7I). Samples of Class D had complete bark coverage
(Figs. 7J, 7K). Many bark layers were present, with the formation of striations because of stacked epidermal cells (Fig. 7L). The hypodermis and parenchyma cells were no longer distinguishable. Histological samples of Pachycereus hollianus of Class A had no visible bark (Figs. 8A, 8B). The epidermis was composed of a single layer of globular-shaped cells with a thick cuticle layer (Fig. 8C). The cactus had a two-layered hypodermis with thin cell walls. The parenchyma was non-stratified. Cactus samples of Class B showed initial bark layers with a cuticle (Figs. 8D, 8E). Bark formation was initiated with periclinal cell divisions of epidermal cells. Simultaneously, hypodermal cells and chlorenchyma cells were disorganized (Fig. 8F). Samples of Class C showed extensive bark coverage (Figs. 8G, 8H). For these samples, there was a non-distinguishable hypodermis and chlorenchyma below dark-yellow bark cells with a cuticle (Fig. 8I). Cactus samples of Class D exhibited complete bark coverage with no hypodermis or cuticle (Figs. 8J, 8K, 8L).
Figure 8. Images of surface, excised stem samples, and histological preparations of Pachycereus hollianus. A-C: Samples of a relatively healthy surface. A: Surface of a trough with no bark. B: Excised stem sample showing green chlorenchyma and parenchyma (interior). C: Histological sample that shows single-celled, globular-shaped epidermal cells with a thick cuticle (e). A two-layered hypodermis is present with relatively thin cell walls (h). The chlorenchyma is not stratified (c). D-F: Samples of a surface with initial bark. D: Sample of a surface that appears to have a thickened cuticle (grey coloration). E: Stem sample showing initial bark covering. The chlorenchyma layer is smaller, and the parenchyma layer is a yellowish color. F: The histological sample shows a multi-layered bark (I). The cuticle (∗) is still present. Hypodermal cells are indistinguishable from the parenchyma cells that are more internally located. G-I: Samples of a surface with extensive bark. G: Surface of a trough with bark. H: Stem sample shows extensive bark, a green chlorenchyma about 2-4 mm in thickness, and a yellow-colored chlorenchyma. I: Histological sample shows extensive bark with dark pigmentation. The cuticle is still present. The hypodermis is no longer distinguishable. J-L: Samples of a surface with complete bark coverage. J: Surface of a trough with bark. K: Stem sample shows extensive bark, a green chlorenchyma about 2 mm in thickness, and a yellow-colored chlorenchyma. L: Extensive bark development with many cellular layers. The cuticle is not present. The hypodermis and chlorenchyma are unorganized.
Discussion

The first hypothesis states that there is a 2:1 ratio of bark coverage on south-facing versus north-facing surfaces. For N. tetetzo, the mean bark coverages for Class III and Class IV cacti were 60% for south-facing crests versus 28% for north-facing crests, and 92% for south-facing crests versus 55% for north-facing crests, respectively. Class IV data of P. hollianus show mean bark coverages of 95% for south-facing crests versus 55% for north-facing crests (Table 2). The approximately 2:1 ratio of south-facing to north-facing surfaces was confirmed. Bark formation on surfaces differs between the two cactus species. For N. tetetzo, bark occurs in very small areas, enlarges, and coalesces with other areas. The bark was brown, eventually turning black. In contrast, for P. hollianus, bark occurs in many small areas simultaneously. Coalescence of bark areas was present in all samples. Unlike that of N. tetetzo, the bark of P. hollianus has only a light-brown color. Therefore, the surface characteristics of bark are very different between the two species.

The second hypothesis states that the anatomical characteristics of the tissues of N. tetetzo and P. hollianus differ. Histological comparisons of the two species support this. Neobuxbaumia tetetzo possesses epidermal cells that are more "villi-like," as well as a thin cuticle layer. In contrast, P. hollianus possesses circular or "half-moon" shaped epidermal cells with a thick cuticle layer. While N. tetetzo loses its cuticle layer with initial bark coverage, P. hollianus retains its cuticle throughout extensive bark coverage. Inspection of a P. hollianus cactus with initial bark showed a thickened, light-grey waxy layer. In addition, N. tetetzo possesses a hypodermis with more than two cell layers, whereas P. hollianus possesses a hypodermis with only two cell layers.
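As a quick arithmetic check, the mean crest coverages reported above can be converted into south:north ratios directly (a minimal sketch; the values are the means from Tables 1 and 2):

```python
# South:north ratios implied by the reported mean crest bark coverages.
pairs = {
    "N. tetetzo Class III crests": (60, 28),
    "N. tetetzo Class IV crests": (92, 55),
    "P. hollianus Class IV crests": (95, 55),
}
for label, (south, north) in pairs.items():
    print(f"{label}: {south / north:.2f}:1")
```

The computed ratios range from roughly 1.7:1 to 2.1:1, i.e. on the order of the hypothesized 2:1.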
Acknowledgments

The author is indebted to the Catherine and Robert Fenton Endowed Chair in Biology of Dr. Lance Evans for financial support of this research.
References

Anderson, E. 2001. The Cactus Family. Timber Press, Portland, Oregon.
Evans, L.S., V. Cantarella, K. Stolte, and K.H. Thompson. 1994. Epidermal browning of saguaro cacti (Carnegiea gigantea): Surface and internal characteristics associated with browning. Environ. Exper. Bot. 34: 9-17.
Evans, L.S., and M. Cooney. 2015. Sunlight-induced bark formation in long-lived South American columnar cacti. Flora 217: 33-40.
Evans, L.S., and A. Macri. 2008. Stem surface injuries of several species of columnar cacti of Ecuador. J. Torrey Bot. Soc. 135: 475-482.
Evans, L.S., V. Sahi, and S. Ghersini. 1995. Epidermal browning of saguaro cacti (Carnegiea gigantea): relative health and rates of surficial injuries of a population. Environ. Exper. Bot. 35: 557-562.
Evans, L.S., J.H. Sullivan, and M. Lim. 2001. Initial effects of UV-B radiation on stem surfaces of Stenocereus thurberi (Organ Pipe cacti). Environ. Exper. Bot. 46: 181-187.
Evans, L.S., A.J.B. Young, and Sr. Joan Harnett. 2005. Changes in scale and bark stem surface injuries and mortality rates of a saguaro cactus (Carnegiea gigantea, Cactaceae) population in Tucson Mountain Park. Can. J. Bot. 83: 311-319.
Evans, L.S., M. Zugermayer, and A.J.B. Young. 2003. Changes in surface injuries and mortality rates of Saguaro (Carnegiea gigantea) cacti over a twelve-year period. J. Torrey Bot. Soc. 130: 238-243.
Geller, G. and P. Nobel. 1984. Cactus ribs: influence of PAR interception and CO2 uptake. Photosynthetica 18: 482-494.
Gibson, A.C. and P.S. Nobel. 1986. The Cactus Primer. Harvard University Press, Cambridge, MA.
Jensen, W.A. 1962. Botanical Histochemistry. W.H. Freeman and Co., Berkeley, CA.
Smith, C.E. 1965. Flora, Tehuacán Valley. Fieldiana Botany 31: 107-143.
Snedecor, G.W. and W.G. Cochran. 1967. Statistical Methods, Sixth Edition. The Iowa State University Press, Ames.
UNESCO. Tehuacán-Cuicatlán Valley: originary habitat of Mesoamerica. United Nations Educational, Scientific and Cultural Organization. http://whc.unesco.org/en/list/1534/
Valiente-Banuet, A. and E. Ezcurra. 1991. Shade as a cause of the association between the cactus Neobuxbaumia tetetzo and the nurse plant Mimosa luisana in the Tehuacán Valley, Mexico.
Predicting mortality rates for saguaro cacti (Carnegiea gigantea)

Cole Johnson∗

Department of Biology, Manhattan College

Abstract. Saguaro cacti (Carnegiea gigantea) are among more than twenty columnar cactus species in the Americas subject to bark formation, or epidermal browning. Bark formation is known to be the result of UV-B light from the sun. Extensive bark coverage leads to premature death of saguaro cactus plants that are otherwise known to live for hundreds of years. This study used the WEKA machine learning program to predict death progression for a population of saguaro cacti located in Tucson Mountain Park, Tucson, AZ, from 1994-2017 over 8-year intervals. It was hypothesized that predicting death for cacti with extensive bark coverage would be more difficult than predicting death of healthy cacti. Each individual cactus was assigned a health class based on percentages of bark coverage on its surfaces in 1994-2002, 2002-2010, and 2010-2017. Cacti were evaluated in 1994, 2002, and 2010 to predict mortality in 2002, 2010, and 2017, respectively. Data were analyzed with WEKA decision tree programs to determine the ability of cactus surfaces (north, south, east, and west) to predict cactus mortality. Confusion matrices were obtained to understand the method of prediction. For cacti with bark coverage below 80%, WEKA generated decision trees capable of predicting death accurately. For cacti above 80% bark coverage, WEKA procedures were less accurate. A statistical analysis revealed that cacti that began with more than 80% bark in 1994 and remained alive in 2017 differed from cacti that began with more than 80% bark and died by 2017.
Introduction

The saguaro cactus (Carnegiea gigantea) is a columnar plant species native to the Sonoran Desert of southern Arizona and Sonora, Mexico. This species, along with more than 24 other columnar cactus species, suffers from a coverage of epidermal bark on its stem surfaces resulting in premature death (Evans and De Bonis, 2015). Bark coverage is caused by a build-up of epicuticular waxes on the stem surfaces of the cactus plant (Evans et al., 1994a; 1994b). As the amount of wax on the stem surfaces accumulates, the stomata on the surface of the plant are blocked, preventing gas exchange from occurring (Evans et al., 1994a; 1994b). The inability to exchange gases deprives the cactus of elements necessary to carry out photosynthesis and respiration, resulting in premature death (Evans et al., 1994a; 1994b; Evans and Macri, 2008). A previous study shows that bark coverage causes cacti to reach mortality at a rate of 2.3% per year (Evans et al., 2005; 2013). This rate is extremely high given that saguaro cacti have a life expectancy of several hundred years (Steenbergh and Lowe, 1977). Studies have shown the cause of bark coverage to be UV-B light exposure from the sun (Evans et al., 2001). Tucson, Arizona, has a latitude of 32.2° N, which is north of the Tropic of Cancer. Therefore, due to the relative position of the sun, south-facing surfaces of saguaro cacti in Tucson, AZ receive four times more sunlight than north-facing surfaces (Evans and Macri, 2008). It is for this reason that south-facing surfaces have been found to be the first to have bark coverage, whereas north-facing surfaces are the last to have bark coverage before cactus death occurs (Evans et al., 1992; 1994a; Figs. 1 and 2). Moreover, previous research shows that bark accumulation
Research mentored by Lance Evans, Ph.D.
Figure 1. Pathway of bark coverage typically followed by saguaro cacti (Carnegiea gigantea). The north right troughs are the last surface to have bark coverage.
Figure 2. Visual representation of the bark coverage present on south, east, north, and west-facing surfaces on a typical saguaro cactus (Carnegiea gigantea).
follows a logistic growth pattern on cactus surfaces (De Bonis et al., 2015; Fig. 3). Bark coverage
Figure 3. Logistic growth curve demonstrating bark accumulation rates typically followed by saguaro cacti (Carnegiea gigantea). The circled region represents unhealthy, morbid cacti.
rates initially begin slowly until a relative threshold is met. The cactus then continues to accumulate bark linearly until it becomes more morbid, at which point bark accumulation slows until the cactus reaches death. This study examined the progression of bark coverage to predict death patterns for cacti of various health statuses from 1994-2017 over eight-year time intervals. The following were hypothesized:

1. WEKA decision trees can predict cactus death with bark coverage data of cactus surfaces.
2. WEKA procedures are not accurate for cacti with extensive bark coverage.
3. Bark coverages were different for cacti that remained alive versus cacti that died.
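The slow-fast-slow pattern of Fig. 3 is that of a logistic curve. A minimal sketch of such a curve (the rate constant, midpoint, and maximum below are illustrative values, not parameters fitted to the survey data):

```python
import math

def logistic_bark(t, k=0.35, t_mid=20.0, maximum=100.0):
    """Percent bark coverage at time t (years) on a logistic curve:
    slow initial accumulation, near-linear growth around t_mid, and a
    slow approach toward complete coverage. Parameters are illustrative."""
    return maximum / (1.0 + math.exp(-k * (t - t_mid)))

# Early, middle, and late points show the slow-fast-slow pattern.
for year in (5, 20, 35):
    print(year, round(logistic_bark(year), 1))
```

At the midpoint the curve sits at exactly half of the maximum, with coverage near zero well before it and near 100% well after it.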
Materials and Methods

Field and survey conditions

A sample of 1,149 saguaro cacti (Carnegiea gigantea) was studied over a 23-year period. All cacti were in 50 field plots within Tucson Mountain Park near Tucson, AZ (32.2238° N, 111.1443° W) (Fig. 4) (Evans et al., 2005). Cacti were first selected in 1993. All selected cacti were taller than 4 m and assumed to be more than 80 years old (Steenbergh and Lowe, 1977; Pierson and Turner, 1998). Physical characteristics, nearby vegetation, topographical features, GPS coordinates, and human-designated markers were used to distinguish each cactus plant within a plot (Evans et al., 2005). The same cacti have been evaluated in 1993-94, 2002, 2010, and 2017.

Cactus surface evaluations

During each field evaluation, the percent green area on the surface of each cactus was visually determined and recorded (Evans et al., 2013). The amount of green area was recorded, as opposed to the amount of bark, to eliminate bias created by different intensities of barking (Evans et al., 2013). The evaluation took place at an elevation of 1.75 meters from the ground, comparable to the breast height of the evaluator, which made the visual evaluations more convenient to execute (Evans et al., 2013). Evaluations covered a range of 8 cm in length along each of the surfaces examined (Evans et al., 1994a; 1994b; 1995). The width of the evaluation depended on the width of the specific cactus rib. A total of 12 surfaces were evaluated on each cactus (Evans et al., 1995; 2003; 2005). These surfaces consisted of the crests, right troughs, and left troughs on the cactus ribs residing closest to each of the four compass directions on the circle of azimuth (Evans et al., 1994a; 1994b; 1995).
Crests are the segments of a cactus rib that protrude outward toward the environment, whereas troughs are concave indentations located on either side of a crest (Geller and Nobel, 1984; Gibson and Nobel, 1986). The position (right or left) of each trough on either side of the adjacent crest was assigned based on how the evaluator viewed them (Fig. 5). The percentage of green area on each surface was converted to a percentage of bark and placed into a Microsoft Excel workbook. The analysis was done for three time periods (1994-2002, 2002-2010, 2010-2017).
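The recorded percent-green values convert to percent-bark values by simple complement. A sketch of that bookkeeping for the 12 evaluated surfaces (the surface codes and the sample record below are hypothetical, not the authors' file format):

```python
# Crest (C), right trough (R), and left trough (L) for each compass direction.
SURFACES = [d + p for d in ("S", "E", "N", "W") for p in ("C", "R", "L")]

def green_to_bark(green_by_surface):
    """Convert recorded percent-green values to percent-bark values."""
    return {s: 100 - g for s, g in green_by_surface.items()}

record = dict.fromkeys(SURFACES, 85)   # hypothetical: 85% green on every surface
bark = green_to_bark(record)
print(len(bark), bark["SC"])           # 12 surfaces; south crest at 15% bark
```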
Figure 4. Landscape view of Tucson Mountain Park, Tucson, Arizona.
Figure 5. Anatomy of the rib of a saguaro cactus (Carnegiea gigantea).
Figure 6. Demonstration of the varying degrees of bark coverage on individual saguaro cactus (Carnegiea gigantea) plants. The photo on the left represents a healthy cactus with limited bark coverage. The photo on the right represents an unhealthy cactus with heavy bark coverage.
Health classes of cacti Cacti vary in the amount of bark coverage they possess, relative to one another (Fig. 6). Each cactus was assigned a health class corresponding to the formation of bark on its south-facing crest only (Table 1) (Evans et al., 2013). An initial health class was assigned at the beginning of the evaluation period. A second health class was assigned upon completion of the evaluation denoting whether the cactus had survived or reached mortality.
Table 1. Classes of saguaro cacti (Carnegiea gigantea) of this study. The bark percentage coverage was determined for south crests only.

Class   Criteria
I       Less than 20% bark at beginning of evaluation period
II      Between 20% and 49% bark at beginning of evaluation period
III     Between 50% and 80% bark at beginning of evaluation period
IV      Greater than 80% bark at beginning of evaluation period
V       Dead
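The class assignment of Table 1 amounts to a threshold function on the south-crest bark percentage; a minimal sketch (the function name is ours, not the authors'):

```python
def health_class(south_crest_bark, alive=True):
    """Health class per Table 1, from percent bark on the south crest only."""
    if not alive:
        return "V"
    if south_crest_bark < 20:
        return "I"
    if south_crest_bark < 50:
        return "II"
    if south_crest_bark <= 80:
        return "III"
    return "IV"

print(health_class(15), health_class(35), health_class(65),
      health_class(95), health_class(95, alive=False))
```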
Theory behind WEKA 3.8

To determine which surfaces were prominent in predicting mortality, a machine learning program known as WEKA 3.8 was used. WEKA uses criteria known as classifiers to predict certain outcomes from one measurement period to the next (Hall et al., 2009). For our purposes, WEKA generated these predictions in the form of J48 decision trees. This format of decision tree is WEKA's implementation of the C4.5 algorithm, a successor to ID3, and allows one to predict target values within a set of data (White et al., 1941; Hitchcock, 1971). Each of the 12 surfaces acted as an examined variable. The decision trees produced by WEKA reveal which surfaces play crucial roles in predicting cactus mortality within a given health category of cacti. The decision trees can be easily interpreted by those who are unfamiliar with machine learning programs and the methods they utilize (Hyafil and Rivest, 1976). The accuracies of the trees were determined using the corresponding confusion matrices.

Use of WEKA

Cacti were separated into distinct Microsoft Excel files based strictly upon the health classes they occupied at the beginning of each time interval. Cacti belonging to each of the classes could either remain alive or reach death by the end of the evaluation. The fate of each cactus at the end of each 8-year evaluation was noted in the file, in a format recognizable by WEKA. WEKA used the data from each of the separate files to generate decision trees that predict the pathway taken to mortality in each of the health classes. The trees used the 12 surfaces to predict cactus death at the end of each 8-year period. The different trees from each of the classes expose which surfaces play a prominent role in making such predictions, as well as which subsets can be predicted accurately.
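C4.5-family trees such as J48 are grown by repeatedly choosing the attribute split with the best purity gain at each node. The sketch below implements that greedy information-gain step directly; it is not WEKA itself, and the toy surface data are hypothetical:

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy of a list of class labels (e.g. 'alive'/'dead')."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def best_split(rows, labels, thresholds=range(0, 100, 5)):
    """Pick the (surface, threshold) whose binary split maximizes information
    gain -- the greedy step a J48-style tree repeats at every node.
    'rows' maps surface name -> list of bark percentages."""
    base, best, n = entropy(labels), (None, None, 0.0), len(labels)
    for surface, values in rows.items():
        for t in thresholds:
            left = [lab for v, lab in zip(values, labels) if v <= t]
            right = [lab for v, lab in zip(values, labels) if v > t]
            if not left or not right:
                continue
            gain = base - (len(left) / n) * entropy(left) \
                        - (len(right) / n) * entropy(right)
            if gain > best[2]:
                best = (surface, t, gain)
    return best

# Toy data: high north-right bark perfectly separates dead from alive cacti,
# so NR is chosen over the nearly constant south crest.
rows = {"NR": [3, 5, 40, 60], "SC": [90, 95, 92, 97]}
labels = ["alive", "alive", "dead", "dead"]
print(best_split(rows, labels)[:2])
```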
Analysis of Class IV cacti

Each cactus that was initially grouped in Class IV (more than 80% bark on the south crest) in 1994 was isolated from the data set and examined. These cacti were further divided into two categories: those remaining alive in 2017 and those that reached death by 2017. A t-test was then performed to compare bark values between these two sets of cacti.
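The comparison is a standard two-sample t statistic. A self-contained sketch (Welch's unequal-variance form is our assumption, since the paper does not specify the variant, and the sample values are made up):

```python
import statistics as st

def welch_t(a, b):
    """Welch's two-sample t statistic for comparing mean bark coverage
    between two groups of cacti (e.g. Class IV survivors vs. deaths)."""
    va, vb = st.variance(a), st.variance(b)      # sample variances (n - 1)
    se = (va / len(a) + vb / len(b)) ** 0.5      # standard error of the difference
    return (st.mean(a) - st.mean(b)) / se

alive = [82, 85, 88, 84, 86]   # illustrative bark percentages only
dead = [93, 95, 97, 94, 96]
print(round(welch_t(alive, dead), 2))
```

A large negative statistic here reflects that the group that died carried more bark; in practice the statistic would be compared against the t distribution to obtain the p-values reported in Table 3.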
Results

WEKA decision trees can predict cactus death with bark coverage data of cactus surfaces

The goal of the first section of this study was to produce decision tree models capable of predicting cactus death at high accuracies using WEKA programs. WEKA decision trees predicted death of Class I cacti from 1994-2002, 2002-2010, and 2010-2017 with accuracies of 89.2%, 83.5%, and 97.1%, respectively (Table 2; Fig. 7). Confusion matrices were also generated. For Class I cacti, more cacti were predicted alive that were actually dead than were predicted dead that were actually alive. Decision trees predicting death for Class II cacti had accuracies between 64.9% and 95.2% (Fig. 8). The lower accuracies coincided with larger numbers of cacti that were predicted to be alive but were actually dead. In a similar manner, decision trees for Class III cacti had accuracies between 57.1% and 92.5%. The low accuracy of 57.1% was for a group of only 14 cacti. In contrast, decision trees for Class IV cacti had lower accuracies (68.9% - 77.4%).

Table 2. Decision trees with various predictor surfaces over each time interval. For simplicity, only the first 5 steps of the decision trees are shown for the predictor surfaces. In the confusion-matrix columns, Pred = predicted status and Act = actual status; the Pred alive/Act alive and Pred dead/Act dead columns give the numbers of cacti correctly predicted by the decision trees, and the other two columns give the numbers of cacti incorrectly predicted.

                                                               Pred alive  Pred alive  Pred dead  Pred dead
Class  Time interval  Predictor surfaces (first 5)             Act alive   Act dead    Act dead   Act alive   n     Accuracy (%)
I      1994-2002      NR 3, SC 7, NC 3, WR 3, ER 2             203         22          3          3           231   89.2
       2002-2010      SL 5, SL 10, WL 5                        71          11          0          3           85    83.5
       2010-2017      NO TREE                                  34          1           0          0           35    97.1
II     1994-2002      NC 35, NC 95, EL 10                      94          12          0          5           111   84.7
       2002-2010      SC 35                                    23          9           0          4           37    64.9
       2010-2017      NO TREE                                  80          4           0          0           84    95.2
III    1994-2002      NO TREE                                  70          11          0          0           81    86.4
       2002-2010      SR 10, EC 40                             8           3           0          3           14    57.1
       2010-2017      NR 22, SL 40                             99          4           0          4           107   92.5
IV     1994-2002      WL 75, NR 55, EL 95, SL 99, NL 35        284         113         35         31          463   68.9
       2002-2010      NL 50, SR 80, WC 96, ER 95, NC 55        272         67          74         34          447   77.4
       2010-2017      SL 95, NL 45, NC 85, WL 45, NC 65        279         70          37         49          425   72.6
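Each accuracy in Table 2 follows from its four confusion-matrix cells: the predicted-alive/actually-alive and predicted-dead/actually-dead cells are the correct calls. A minimal sketch, checked against the Class I, 1994-2002 row:

```python
def tree_accuracy(pa_aa, pa_ad, pd_ad, pd_aa):
    """Percent accuracy from the four confusion-matrix cells of Table 2:
    predicted-alive/actually-alive (pa_aa) and predicted-dead/actually-dead
    (pd_ad) are correct; the other two cells are misclassifications."""
    correct = pa_aa + pd_ad
    total = pa_aa + pa_ad + pd_ad + pd_aa
    return 100 * correct / total

# Class I, 1994-2002: cells 203, 22, 3, 3 reproduce the reported 89.2%.
print(round(tree_accuracy(203, 22, 3, 3), 1))
```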
Figure 7. WEKA decision tree predicting death among Class I cacti from 1994-2002.
Figure 8. WEKA decision tree predicting death among Class II cacti from 1994-2002.
WEKA procedures are less accurate for cacti with extensive bark coverage

As shown in Table 2, the accuracies with which WEKA predicted death were generally greater for Classes I, II, and III than for Class IV. Moreover, the Class IV decision trees were much more complex than those generated for the other classes, requiring many more steps to reach a decision. While the decision tree predicting death of Class IV cacti from 1994-2002 consisted of only 5 total steps, the tree for 2002-2010 required 32 steps and the tree for 2010-2017 required 24 steps. Decision trees for the lower health classes required only 1-6 steps.

Table 3. Mean bark percentages of cacti that began as Class IV in 1994 and either remained alive into 2017 or reached mortality by 2017. A t-test was performed on the raw data values of the two groups for each evaluation period. Values with no asterisk are statistically similar. Values marked with one asterisk were significantly different at p between 0.01 and 0.05; values marked with two asterisks were significantly different at p < 0.01.

2017 status  SC      SR      SL      EC      ER      EL      NC      NR      NL      WC      WR      WL

from 1994
Alive        95.8**  47.2**  49.3**  71.7*   30.7*   50.1**  41.4**  10.4    18.4    57.2**  34.7*   14.4
Dead         98.4**  67.5**  65.5**  82.0*   42.0*   68.2**  57.0**  15.4    26.2    70.0**  47.2*   16.8

from 2002
Alive        98.3    68.5**  70.7**  82.0**  42.0**  67.2**  55.6**  19.1*   27.1**  74.9    55.9*   30.1
Dead         98.6    83.9**  86.2**  91.8**  68.7**  83.4**  73.9**  27.4*   41.4**  85.2    66.8*   37.2

from 2010
Alive        99.1    81.4**  85.4*   89.4**  62.7**  79.4**  75.0**  33.3**  38.5**  92.0    71.8**  54.1*
Dead         98.6    92.9**  92.7*   96.9**  78.6**  90.9**  91.2**  49.7**  56.7**  94.4    84.7**  63.8*
Bark coverages were different for cacti that remained alive versus cacti that died

Because WEKA decision trees predicted death of Class IV cacti with low accuracy, bark coverages of Class IV cacti in 1994 that were alive versus dead in 2017 were analyzed. Class IV cacti that died between 2010 and 2017 had higher bark coverages for all surfaces in 1994 than cacti that remained alive in 2017, with the exception of south crests in 2010 (Table 3). Cacti that began in Class IV in 1994 and died between 2010 and 2017 were statistically different from cacti remaining alive in 2017, with the exception of north-right troughs, north-left troughs, and west-left troughs in 1994; south crests and west-left troughs in 2002; and south crests and west crests in 2010.
Discussion

The purpose of this study was to closely examine bark coverage and cactus death within a population of saguaro cacti. Bark coverage on saguaro cacti was a rare phenomenon prior to the 1950s. Before this time, saguaros experienced limited barking and grew to very large heights. Recently, saguaro cacti and other columnar cactus species have been experiencing bark coverage at rapid rates. Bark coverage is a precursor to death, and it can provide great insight as to when a cactus will reach mortality. As previously stated, bark coverage on cactus plants typically follows a certain pattern. South-facing surfaces are the first to exhibit epidermal barking, whereas north-facing surfaces are normally last (De Bonis et al., 2017). Bark formation also follows a logistic growth pattern, with slower rates of bark formation occurring in unhealthy cacti (Evans and De Bonis, 2015). For these reasons, the hypotheses considered the progression of bark coverage on saguaro cactus plants as a basis for predicting death.

The initial hypothesis of this study stated that WEKA decision trees can predict cactus death with bark coverage data of cactus surfaces. WEKA was able to formulate at least two decision trees predicting death for cacti in each health class across the evaluation periods, and was able to project a pathway to death for a cactus in a given health class at high accuracy. The decision-tree models can therefore be trusted to predict cactus death, and this hypothesis can be regarded as supported. Instances in which WEKA was unable to generate a tree, or did so with low accuracy within Classes I, II, and III, are likely not due to random chance and can be explained through reasoning. The algorithm used by WEKA to generate decision trees implements ten-fold cross-validation.
This technique uses 90% of the cacti in a given data set to train WEKA into producing a decision tree while using the remaining 10% of cacti to test the accuracy of the tree. Therefore, if a specific subset does not contain a large number of cacti, it is likely that WEKA will not have enough information to generate a decision tree adequately. For example, WEKA was not able to produce a decision tree predicting death for Class I cacti from 2010-2017 because only 35 cacti were available for examination in this subset. Moreover, only one cactus died during this time period, whereas 34 remained alive. The proportion of alive to dead cacti available to train and validate a decision tree model for this subset is therefore extremely uneven, making WEKA incapable of generating a tree. Other such instances within the decision trees corresponding to Classes I, II, and III can be explained with the same reasoning. Furthermore, it was found that predicting death of unhealthy, Class IV cacti using WEKA programming is fairly unsuccessful in comparison to other classes, thus validating the second hypothesis. Not only were the trees corresponding to Class IV cacti less accurate, but they were also much larger and more complex in structure than those generated for other health classes. This indicates that the pathway taken to death by Class IV cacti is extremely complex and inconsistent. These data
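The ten-fold mechanics described above can be sketched with plain index arithmetic (WEKA's actual implementation stratifies the folds by class; this simplified version does not):

```python
def ten_fold_splits(n_items):
    """Yield (train, test) index lists: each fold holds out ~10% of the
    cacti for testing while the remaining ~90% train the tree."""
    indices = list(range(n_items))
    for i in range(10):
        held_out = indices[i::10]
        train = [j for j in indices if j % 10 != i]
        yield train, held_out

# For the Class I, 2010-2017 subset of 35 cacti (only 1 of which died),
# each test fold holds just 3-4 cacti, so most folds contain no dead
# cactus at all -- one reason WEKA could not build a tree for that subset.
splits = list(ten_fold_splits(35))
print(len(splits), len(splits[0][1]), len(splits[0][0]))
```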
92
The Manhattan Scientist, Series B, Volume 5 (2018)
Johnson
suggest a complex underlying survival mechanism implemented by an unhealthy cactus to remain alive for as long as possible. Additional results of this study suggest ways in which WEKA could be used to predict death of Class IV cacti more successfully in the future. The results pertaining to the third hypothesis show that there is a statistical difference between cacti that began as Class IV in 1994 and either remained alive or died by 2017. Therefore, it may be plausible to predict death of Class IV cacti over longer time periods as opposed to 8-year intervals. Predicting death of Class IV cacti from 1994-2017 using WEKA procedures may prove more successful than doing so from 1994-2002, 2002-2010, or 2010-2017. This idea is further supported by the fact that the greatest difference in average bark coverage between cacti that began as Class IV in 1994 and died by 2017, and those that remained alive in 2017, occurs in 1994, as demonstrated in Table 3. As time progresses to 2002, and further to 2010, the difference in average bark coverage between the cacti that lived and the ones that died diminishes. In addition to further examining methods of predicting death of Class IV cacti, additional research may include a closer examination of bark accumulation rates to see if the population is in a steady-state equilibrium with regard to bark accumulation. This examination could potentially reveal supplementary information useful in explaining the complexity of Class IV death progression, as well as that of the other health classes. The results of this study, together with those of previous ones, could provide substantial evidence in unveiling factors that influence the rate of bark coverage and mortality for individual cactus plants.
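The ten-fold cross-validation issue described above can be illustrated directly. The sketch below is a hypothetical Python illustration (WEKA itself performs this internally, in Java); it shows why a subset like the Class I 2010-2017 case, with 34 living cacti and a single death, leaves almost every fold with no dead cactus to validate against:

```python
# Hypothetical sketch of ten-fold cross-validation on a tiny, imbalanced subset.
# Labels mimic the Class I 2010-2017 case: 34 alive ("A") and 1 dead ("D").
labels = ["A"] * 34 + ["D"]

def ten_fold_splits(data, k=10):
    """Yield (train, test) index lists for k roughly equal, contiguous folds."""
    n = len(data)
    fold_sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = [i for i in range(n) if i not in test]
        start += size
        yield train, test

# Count how many folds contain zero dead cacti in their test portion:
empty = sum(1 for train, test in ten_fold_splits(labels)
            if all(labels[i] == "A" for i in test))
print(empty)  # 9 of the 10 folds have no dead cactus to validate against
```

With nine of ten folds containing no dead cactus, the learner has essentially nothing against which to validate a "dead" prediction, which is consistent with no usable tree being built for this subset.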
The saguaro cactus was known to live for several hundred years (Steenbergh and Lowe, 1977); however, recent studies show that the expected lifespan of these organisms has decreased substantially since bark formation became more prominent in the 1950s (O'Brien et al., 2011). The saguaro cactus and other plant species subject to premature death caused by epidermal barking are delicacies to several groups of people across North America. If factors directly related to the rate of bark coverage can be determined, then preventative action may be taken to increase the longevity of these once-prosperous plants.
Acknowledgments This work was supported by the Catherine and Robert Fenton Endowed Chair in Biology to Dr. Lance Evans. The author is grateful to Dr. Lance Evans for his mentorship and extensive help with this study.
References
DeBonis, M., L. Barton, and L.S. Evans. 2017. Rates of bark coverage of surfaces of saguaro cacti (Carnegiea gigantea). J. Torrey Bot. Soc. 131: 236-248.
Evans, L.S., P. Boothe, and A. Baez. 2013. Predicting morbidity and mortality for a saguaro cactus (Carnegiea gigantea) population. J. Torrey Bot. Soc. 140: 247-255.
Evans, L.S., V.A. Cantarella, L. Kaszczak, S.M. Krempasky, and K.H. Thompson. 1994b. Epidermal browning of saguaro cacti (Carnegiea gigantea). Physiological effects, rates of browning and relation to sun/shade conditions. Environ. Exp. Bot. 34: 107-115.
Evans, L.S., V.A. Cantarella, K.W. Stolte, and K.H. Thompson. 1994a. Phenological changes associated with epidermal browning of saguaro cacti at Saguaro National Monument. Environ. Exp. Bot. 34: 9-17.
Evans, L.S., and M. DeBonis. 2015. Predicting morbidity and mortality of saguaro cacti (Carnegiea gigantea). J. Torrey Bot. Soc. 142: 231-239.
Evans, L.S., K.A. Howard, and E. Stolze. 1992. Epidermal browning of saguaro cacti (Carnegiea gigantea): Is it new or related to direction? Environ. Exp. Bot. 32: 357-363.
Evans, L.S., and A. Macri. 2008. Stem surface injuries of several species of columnar cacti of Ecuador. J. Torrey Bot. Soc. 135: 475-482.
Evans, L.S., V. Sahi, and S. Ghersini. 1995. Epidermal browning of saguaro cacti (Carnegiea gigantea): Relative health and rates of surficial injuries of a population. Environ. Exp. Bot. 35: 557-562.
Evans, L.S., J. Sullivan, and M. Lim. 2001. Initial effects of UV-B radiation on stem surfaces of Stenocereus thurberi (organ pipe cacti). Environ. Exp. Bot. 46: 181-187.
Evans, L.S., A.J. Young, and Sr. J. Harnett. 2005. Changes in scale and bark stem surface injuries and mortality rates of a saguaro (Carnegiea gigantea) cacti population in Tucson Mountain Park. Can. J. Bot. 83: 311-319.
Evans, L.S., M. Zugermayr, and A.J. Young. 2003. Changes in surface and mortality rates of saguaro (Carnegiea gigantea) cacti over a twelve-year period. J. Torrey Bot. Soc. 130: 238-243.
Geller, G., and P. Nobel. 1984. Cactus ribs: Influence on PAR interception and CO2 uptake. Photosynthetica 18: 482-494.
Gibson, A., and P. Nobel. 1986. The Cactus Primer. Harvard University Press, Cambridge, MA. 286 p.
Hall, M., E. Frank, G. Holmes, B. Pfahringer, P. Reutemann, and I.H. Witten. 2009. The WEKA data mining software: An update. SIGKDD Explorations 11(1).
Hitchcock, A. 1971. Manual of the Grasses of the United States. Dover Publishers, New York, NY. 1051 p.
Hyafil, L., and R. Rivest. 1976. Constructing optimal binary decision trees is NP-complete. Inform. Process. Lett. 5: 15-17.
O'Brien, K., D. Swann, and A. Springer. 2011. Results of the 2010 saguaro census at Saguaro National Park. National Park Service, U.S. Department of Interior, Tucson, AZ. 49 p.
Pierson, E., and R. Turner. 1998. An 85-year study of saguaro (Carnegiea gigantea) demography. Ecology 79: 2676-2693.
Steenbergh, W.F., and C.H. Lowe. 1977. Ecology of the Saguaro: II. Reproduction, germination, establishment, growth, and survival of the young plant. National Park Service Monograph Series Eight.
White, A., R.A. Dyer, and B.L. Sloane. 1941. The Succulent Euphorbieae (Southern Africa). Abbey Garden Press, Pasadena, CA. 937 p.
Predicting rates of bark coverage on saguaro cacti (Carnegiea gigantea) George Kennedy∗ Department of Mechanical Engineering, Manhattan College Abstract. In the Sonoran Desert of Arizona, the saguaro cactus (Carnegiea gigantea) is an abundant species that exhibits bark coverage on stem surfaces that eventually leads to cactus death. Bark accumulation is caused by the harmful rays of the sun. Bark begins growing on south-facing surfaces, eventually accumulates on east-facing and west-facing surfaces, and ultimately on north-facing surfaces. Once the north surfaces are 80% covered, a cactus has a 95% probability of death within eight years. The Classification Learner App in MATLAB used supervised machine learning to train models and classify data. The database consisted of twelve surfaces on each of 1149 cacti for 1994, 2002, 2010, and 2017. The outputs of this application, as well as its accompanying codes, included confusion matrices, scatter plots with model predictions, decision trees, and predictions of bark coverage/cactus death. All prediction accuracies to date were above 90.5%. Overall, the Classification Learner App used more than 55,000 data points to predict bark coverages and cactus death accurately. The combination of the four procedures described herein may provide excellent predictive abilities for many biomedical applications, such as patterns of cell proliferation for hard tumors and the spread of infections among persons during widespread epidemics.
Introduction Epidermal browning or bark coverage occurs on saguaro (Carnegiea gigantea (Engelm.) Britt. and Rose) cacti (Evans et al., 1994a; 1994b; 1995) native to Tucson, Arizona due to sunlight exposure. In Fig. 1 it is evident that the cacti are exposed to high levels of sunlight from all angles throughout the day. Fig. 2 displays bark coverage on a given trough of a saguaro cactus. For saguaro cacti, bark coverage begins on the south-facing surfaces but eventually covers all stem surfaces. North-facing surfaces show bark about 10 to 15 years after south-facing surfaces (DeBonis et al., 2017). South-facing surfaces receive about four times more sunlight on an annual basis compared with north-facing surfaces. Bark coverage on cactus plants is not common (Anderson, 2001). Past publications indicate that saguaro cacti normally live for 200 to 300 years (Steenbergh and Lowe, 1977). However, more recent data show that saguaro cacti near Tucson, AZ have life expectancies of less than 120 years (O'Brien et al., 2011). The increased overall mortality of the cactus population is attributed to sunlight. Based on previous analysis of UV-B exposure on natural stands, UV-B also appears to be the cause of the epidermal browning symptoms observed in cactus plants. For a saguaro cactus to fulfill its expected lifetime of 200 to 300 years (Steenbergh and Lowe, 1977), it must have functioning stomata for gas exchange. The stomata are located on cactus surfaces that consist of a thick cuticle, an epidermis, and several hypodermal cell layers.
∗Research mentored by Lance Evans, Ph.D.
Figure 1. Landscape view of Tucson Mountain Park, Tucson, Arizona.
Figure 2. Depiction of bark coverage on the north right trough of a saguaro cactus (Carnegiea gigantea).
Figure 3. Geometric representation of the 12 sides of each cactus. The north right trough is highlighted as it is used for classification.
Bark accumulation is essentially the buildup of epicuticular waxes followed by proliferation of epidermal cells to form a scale and bark (Evans et al., 1994a; 1994b; Evans and Macri, 2008). The bark inhibits the functioning of the stomata on the surfaces, and inevitably the functioning of gas exchange. Gas exchange includes photosynthesis and respiration, which are vital to cactus life (Evans and Macri, 2008). This study is of interest because saguaro cacti are dying before their expected 200-year lifetime (Evans et al., 1995). The experiments of this study span from 1993 to 2017 to determine factors and influences that are contributing to the evident premature mortality of saguaros. The specific
aim of this study was to identify patterns in bark progression over 8-year intervals so as to isolate prominent predictive surfaces involved in bark progression. It was hypothesized that WEKA 3.8 programming is capable of isolating the major predictive surface of bark accumulation on the north right-facing trough. The purpose of the current research was to determine if bark coverages on a variety of cactus surfaces can predict bark coverage on north-facing right troughs. Fig. 3 is a depiction of the 12 sides of the saguaro cactus; the north right face is circled as it is used in classifications. The north right trough is typically the last surface to accumulate bark due to the sun's path. The classification method used in this experiment was based on the bark accumulation percentage of each north right trough in the dataset. To classify these data, the Classification Learner Application in MATLAB was utilized. This application is capable of creating trained models that can be used to classify large collections of data. Once a model is trained, the process can be repeated with different sets of data. This application is also capable of making classification predictions for cacti in future years.
Materials and Methods Field evaluations Saguaro cacti (Carnegiea gigantea (Engelm.)) were studied in Tucson Mountain Park (32.2° N, 111.1° W). In 1994, 50 permanent plots with 1149 cacti were randomly selected. The cacti were evaluated in 1994, 2002, 2010, and 2017. The selected cacti were all taller than 4 meters. Morphological features of the cacti, characteristics of nearby vegetation, topographical features, and GPS data were used to identify cacti for each field evaluation. Cactus surface evaluations Saguaro cacti have vertical ribs with crests (protrusions) that are separated by convex troughs (Geller and Nobel, 1984; Gibson and Nobel, 1986). Once each cactus was re-located during each field re-evaluation, the crests closest to the south, east, north, and west directions were evaluated. In addition, the right and left troughs of each crest were re-evaluated (Evans et al., 1994a; 1994b; 1995). For each crest and trough, a surface 8 cm long was evaluated at 1.75 meters from the ground (Evans et al., 1994a; 1994b; 1995). The percent bark coverage for these surfaces was estimated visually. Past data showed that visual estimates of bark coverage were similar to estimates from digital methods (Evans and De Bonis, 2015). As a result, the initial Excel data file contained the bark coverage percentages for the crest and the two troughs, each for the south-, east-, north- and west-facing directions. Thus, each cactus had data of bark coverage on twelve surfaces for analysis. Computer model development The first step in the process was to place each cactus into a bark coverage class based upon the percentage of bark coverage on surfaces. For diverse populations, entire populations cannot be analyzed as a whole, so cacti were placed in bark coverage classes. For this analysis, classes were based upon bark coverage percentages on north-facing right troughs. This basis is used because bark coverage occurs last on north-facing right troughs among the surfaces analyzed (DeBonis et al., 2017).
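The twelve-surface data layout described above can be sketched as one record per cactus. This is a hypothetical illustration only; the field names and example values are invented and do not reflect the study's actual Excel file format:

```python
# Hypothetical layout of the field data: for each cactus, percent bark
# coverage on twelve surfaces (crest plus left/right troughs, for each of
# the four compass directions). Values here are made up.
DIRECTIONS = ["south", "east", "north", "west"]
POSITIONS = ["crest", "left_trough", "right_trough"]

def make_record(cactus_id, coverages):
    """coverages: 12 percentages ordered direction-major (S, E, N, W)."""
    assert len(coverages) == 12
    surfaces = {}
    i = 0
    for d in DIRECTIONS:
        for p in POSITIONS:
            surfaces[f"{d}_{p}"] = coverages[i]
            i += 1
    return {"id": cactus_id, "surfaces": surfaces}

rec = make_record(1, [90, 85, 80, 70, 60, 65, 20, 10, 15, 55, 45, 50])
print(rec["surfaces"]["north_right_trough"])  # 15
```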
For the scenario of north-right troughs, at the start of the period of study the classes were as follows: Class 1 had bark coverage less than 20% on north-right troughs. Class 2 had bark coverage between 21 and 40%. Class 3 had bark coverage between 41 and 60%. Class 4 had bark coverage between 61 and 80%. Class 5 had bark coverage greater than 80%. To process and train data, the Classification Learner App requires input data that consist of predictors and response classes. MATLAB scripts were used to separate the cacti by year. These scripts ultimately generated a matrix that could be used as input data for the Classification App. New (classified) files created for each year contain cactus class numbers and designate whether a cactus is dead or alive. Coded files are constructed using “if” and “elseif” statements applied to each row of data (one row for each cactus). The code uses a loop to process the entire data set, one cactus (row of data) at a time. This process was used to create many classified files (e.g., Class 1 for 1994). Because the outputs shown used the north right trough to predict itself, their accuracy is very high. Many different predictor surfaces were used, with varying accuracies, throughout this experiment. Data analysis - Classification Learner App Each Excel file was imported into the Classification Learner App. For each imported classified file, “predictor surfaces” were designated and varied throughout the data analysis. There can be one or many predictor surfaces for each session. During the session, the application trains itself with each row of data and individual “i” values to produce a final decision tree using the “All Trees” option. A decision-tree script was then run in order to display each individual decision tree.
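The “if”/“elseif” classification loop described above was written in MATLAB; what follows is a hypothetical Python equivalent, not the study's actual script. The class boundaries follow the text, with the small 20/21% gap closed at 20%:

```python
# Hypothetical Python version of the MATLAB classification loop described
# above. Each row is one cactus; "nr" is percent bark coverage on the
# north-right trough.
def assign_class(nr):
    """Return the bark-coverage class (1-5) for a north-right trough value."""
    if nr <= 20:
        return 1
    elif nr <= 40:
        return 2
    elif nr <= 60:
        return 3
    elif nr <= 80:
        return 4
    else:
        return 5

rows = [{"id": 1, "nr": 5.0}, {"id": 2, "nr": 55.0}, {"id": 3, "nr": 92.5}]
classified = [{**row, "class": assign_class(row["nr"])} for row in rows]
print([r["class"] for r in classified])  # [1, 3, 5]
```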
The program can also separate the data into individual time periods to determine year-to-year changes in bark coverage on the predicted surface. Many individual time periods can be selected. The application displayed a scatter plot with correct and incorrect model predictions. The confusion matrix summarizes the classification of bark coverage. Data outputs: decision tree, scatter plot, model predictions from scatter plot, confusion matrix. Classes 1 or 2. This was the first classification method used; it classified cacti into either Class 1 or 2 depending on bark coverage of (in this case) the north-right troughs. The Excel file used contained cacti that either went from Class 1 to 2 or remained in Class 1. After selecting the desired predictors and responses, the classification app generated a scatter plot for the model being trained. The scatter plot shows both the input data and the model predictions made by the classification app. The predictors can also be changed in order to examine different relationships between sides and bark coverages.
Table 1 is a confusion matrix for predicting whether a cactus is in Class 1 or 2, with an accuracy of 90.5%. The confusion matrix shows the true class vs. the class predicted by the application and model. In many cases, and specifically in this one because it focuses on Classes 1 and 2, there are far more Class 1 cacti than Classes 2-5; this increases the accuracy of the model.

Table 1. Class 1-2 confusion matrix of the scatterplot data in Fig. 3, predicting Class 1 or 2 with an accuracy of 90.5% (rows: true class; columns: predicted class)

                  Predicted Class 1    Predicted Class 2
True Class 1            453                   12
True Class 2             36                    4
Fig. 4 is the decision tree that accompanies Figs. 1 and 2. The decision tree is produced from a simple two-line code that can also make predictions for future data models. The decision tree uses the given predictors to create criteria that decide the bark coverages required for each class. The responses can be changed based on what is included in the input data.

Figure 4. Decision tree for cacti in Classes 1 and 2 (SR = south right trough, NL = north left trough, NC = north crest; values are percent bark coverage):
    SR < 32.5 → Class 1
    SR ≥ 32.5:
        NL < 13.5 → Class 1
        NL ≥ 13.5:
            NC < 85 → Class 1
            NC ≥ 85 → Class 2
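The Fig. 4 tree can be transcribed as nested conditionals. This is an illustrative sketch, not the code generated by the Classification Learner App; the thresholds are taken from the figure:

```python
# The Fig. 4 decision tree as nested conditionals. Inputs are percent bark
# coverage: SR = south-right trough, NL = north-left trough, NC = north crest.
def predict_class(sr, nl, nc):
    if sr < 32.5:
        return 1          # low south-right coverage: Class 1
    if nl < 13.5:
        return 1          # heavy SR but light NL coverage: still Class 1
    return 1 if nc < 85 else 2  # north crest decides the final split

print(predict_class(10, 0, 0))    # 1: low south-right coverage
print(predict_class(50, 40, 90))  # 2: heavy coverage on all three surfaces
```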
Classes 1-5: The data input into the classification app for these outputs were an Excel file with bark coverage percentages. These data were separated by year into four sheets for 1994, 2002, 2010, and 2017. A scatter plot (Fig. 5) of the data was created by the application to predict Classes 1 through 5 based upon data from 1994, with an accuracy of 99.6%. After selecting the desired predictors and responses, the classification app generated a scatter plot for the model being trained. The scatter plot showed the model predictions made by the classification app. The predictors can also be changed in order to examine different relationships between sides and bark coverages. Each data point represents a cactus, and points are colored based on their response classes. Data
points are crossed out when they are predicted incorrectly. This figure also shows the accuracy of the current classification. The responses in this case are Classes 1-5. Because this decision tree used the north right trough to predict itself, its accuracy is very high. Many different predictor surfaces were used, with varying accuracies.
Figure 5. Scatter plot of data to predict Classes 1 through 5 based upon data from 1994 with an accuracy of 99.6%
Figure 6. Decision tree that accompanies Fig. 5 (NR = north right trough; values are percent bark coverage):
    NR < 20 → Class 1
    NR ≥ 20:
        NR < 40 → Class 2
        NR ≥ 40:
            NR < 60 → Class 3
            NR ≥ 60:
                NR < 80 → Class 4
                NR ≥ 80 → Class 5
Table 2 is the confusion matrix for predicting Classes 1-5, with an accuracy of 99.6%. As there are now five response classes instead of two, the model predicts which of the five classes each data point falls into. Because there is a substantial number of Class 1 cacti in 1994, this model's accuracy is very high.

Table 2. Confusion matrix of the scatterplot data in Fig. 5, predicting Classes 1 through 5 with an accuracy of 99.6% (rows: true class; columns: predicted class)

True Class    Pred. 1    Pred. 2    Pred. 3    Pred. 4    Pred. 5
Class 1          449          0          0          0          0
Class 2            0         47          0          0          0
Class 3            0          0         20          0          0
Class 4            0          0          1         13          0
Class 5            0          0          0          1         18
Fig. 6 is the decision tree that accompanies Table 2 and Fig. 5. This decision tree was the basis for the many different trees produced with varying predictor surfaces. As the predictor surfaces were changed, the accuracy of the model predictions varied as well.
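The reported 99.6% accuracy can be checked directly from the Table 2 counts. The matrix arrangement below (rows true, columns predicted) is a reconstruction from the published values:

```python
# Overall accuracy from the Table 2 confusion matrix (rows = true class,
# columns = predicted class), reconstructed from the reported counts.
cm = [
    [449, 0, 0, 0, 0],
    [0, 47, 0, 0, 0],
    [0, 0, 20, 0, 0],
    [0, 0, 1, 13, 0],
    [0, 0, 0, 1, 18],
]
correct = sum(cm[i][i] for i in range(5))        # diagonal entries
total = sum(sum(row) for row in cm)              # all classified cacti
print(round(100 * correct / total, 1))  # 99.6
```

Only two of the 549 cacti fall off the diagonal (one true Class 4 predicted as Class 3, one true Class 5 predicted as Class 4), giving 547/549 ≈ 99.6%.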
Discussion and Results The purpose of this study was to analyze bark coverage on a large number of saguaro cacti and their subsequent deaths. Bark accumulation was a rare occurrence before the 1950s; however, many saguaros in 2017 are either near death or already dead from full bark coverage. The current rapid rates of bark growth are harmful to any cactus species exposed to high levels of UV-B from sunlight. Bark coverage also follows a common path when spreading throughout the troughs and crests of a cactus. Starting with south-facing surfaces, the bark eventually spreads through east- and west-facing surfaces, finally reaching north-facing surfaces (DeBonis et al., 2017). The outputs of this application, as well as its accompanying codes, included confusion matrices, scatter plots with model predictions, decision trees, and predictions of bark coverage/cactus death. Each of these outputs had an accuracy of over 90% when making predictions. In order to utilize MATLAB's Classification Learner App, input data must be formatted accordingly. Predictors and responses must be clearly indicated in the input data, and enough data must be present in order to generate an accurately trained model. The first input data consisted of cacti that either remained in Class 1 or moved to Class 2 from 1994 to 2002. The application evaluates the north right surface of each cactus and separates cacti by class depending on their bark coverage and the classification method being used. The application then uses its model predictions to output a confusion matrix based on the decision tree model that was trained.
Conclusion The relatively recent trend in bark accumulation in saguaro cacti is very harmful to their functionality and detrimental to their lifespan. As the harmful rays of the sun will continue to cause
bark accumulation, it is beneficial to know as much as possible about the future of these cacti. The ability to accurately determine and predict response classes from “big data” will play a significant role in analyzing the mortality of saguaro cacti. Using the various decision trees and confusion matrices output by the Classification Learner application, it was easier to single out outliers and anomalies in our dataset. Finding trends in the progression of “big data” also has many applications outside of the saguaro cactus. These methods of machine learning with the given dataset of bark coverages made it possible to classify, organize, and make predictions about the status of cacti in the future. Machine learning methods applied to large data sets containing predictors and response classes can likewise be applied to a variety of topics, including the spread of cancer cells and disease.
Acknowledgements This work was supported by the Robert and Catherine Fenton Endowed Chair in Biology to Dr. Lance Evans. The author is grateful to Dr. Ehsan Atefi and Dr. Lance Evans, as well as the mechanical engineering and biology departments for this opportunity.
References
Anderson, E. 2001. The Cactus Family. Timber Press, Portland, OR.
DeBonis, M., L. Barton, and L.S. Evans. 2017. Rates of bark formation on surfaces of saguaro cacti (Carnegiea gigantea). J. Torrey Bot. Soc. 144: 450-458.
Evans, L.S., and M. DeBonis. 2015. Predicting morbidity and mortality of saguaro cacti (Carnegiea gigantea). J. Torrey Bot. Soc. 142: 231-239.
Evans, L.S., V.A. Cantarella, K.W. Stolte, and K.H. Thompson. 1994a. Phenological changes associated with epidermal browning of saguaro cacti at Saguaro National Monument. Environ. Exp. Bot. 34: 9-17.
Evans, L.S., V.A. Cantarella, L. Kaszczak, S.M. Krempasky, and K.H. Thompson. 1994b. Epidermal browning of saguaro cacti (Carnegiea gigantea). Physiological effects, rates of browning and relation to sun/shade conditions. Environ. Exp. Bot. 34: 107-115.
Evans, L.S., and A. Macri. 2008. Stem surface injuries of several species of columnar cacti of Ecuador. J. Torrey Bot. Soc. 135: 475-482.
Evans, L.S., V. Sahi, and S. Ghersini. 1995. Epidermal browning of saguaro cacti (Carnegiea gigantea): Relative health and rates of surficial injuries of a population. Environ. Exp. Bot. 35: 557-562.
Geller, G., and P. Nobel. 1984. Cactus ribs: Influence on PAR interception and CO2 uptake. Photosynthetica 18: 482-494.
Gibson, A., and P. Nobel. 1986. The Cactus Primer. Harvard University Press, Cambridge, MA. 286 p.
O'Brien, K., D. Swann, and A. Springer. 2011. Results of the 2010 saguaro census at Saguaro National Park. National Park Service, U.S. Department of Interior, Tucson, AZ.
Steenbergh, W.F., and C.H. Lowe. 1977. Ecology of the Saguaro: II. Reproduction, germination, establishment, growth, and survival of the young plant. National Park Service Monograph Series Eight.
Scaling relationships between leaf areas and leaf veins for percurrent leaves Gina Leoncavallo∗ Laboratory of Plant Morphogenesis, Department of Biology, Manhattan College Abstract. This study had two main components: (a) to determine the relationships between entire leaf areas and interveinal areas, and the relationships between vein lengths, and (b) to examine how these properties change as a leaf enlarges. For the first part of the study, 23 distinct plant species were examined. For the enlargement aspect of the study, we focused on just 5 species. One criterion required in both components was that the leaves be percurrent, meaning that there is one primary vein, with secondary veins extending from that primary vein, tertiary veins extending from one secondary vein to the next, and quaternary veins filling in between the tertiary veins. All 23 species had this percurrent leaf venation pattern. In terms of leaf areas, the entire area of the leaf was called the leaf area, the area between two secondary veins was called a secondary area, and the area between two tertiary veins was called a tertiary area. In this study, we determined that while leaf interveinal areas enlarge as the leaf enlarges, the actual number of areas does not increase. In other words, the size increases but the number does not. We also found that larger leaves have larger primary, secondary, and tertiary areas.
Introduction Leaves provide the remainder of a plant with carbohydrates. When stomata are open, carbon is taken up and water is lost via transpiration. Because leaves continuously lose water, they must obtain water from stems. Within each leaf, water must be effectively moved to all parts of the leaf. In plants, water transport only occurs in vessel (xylem) cells in primary, secondary, and tertiary veins. Water therefore enters leaves from petioles into primary veins and then moves on to the smaller secondary and tertiary veins. One expects larger veins to have more and larger vessels, and progressively smaller veins to have fewer and smaller vessels. The purpose of this study is to understand relationships between vein characteristics and the leaf areas that they supply with water. Is there a scaling of vein lengths with the areas of leaves that are fed by the veins? A method to test the scaling of the three types of veins with leaf areas is to measure vein characteristics and leaf areas as individual leaves enlarge. Twenty-three (23) species of plants with percurrent leaves were tested. Leaves are specialized organs that allow plants to capture light and use it to create carbohydrates and O2 during photosynthesis (Lambers et al., 1998). Due to this ability, plants are the primary producers on Earth and are responsible for the majority of the available O2 in the environment. The light is captured by chloroplasts, which contain chlorophyll and are the sites where photosynthesis takes place. Carbon dioxide enters leaves via stomata. Stomata are openings between guard cells that open and close in response to the amount of available water. When there is little water to take up, the stomata close in order to preserve water and to maximize water-use efficiency. Water-use efficiency is defined as the carbon gain per water lost (Hopkins et al., 2009).
∗Research mentored by Lance Evans, Ph.D.
Availability of water is the most limiting factor for plant growth. Some 70-95% of the biomass of leaves and roots is water, so an insufficient amount of water can be extremely detrimental to the health of a plant, if not fatal. When plants do not take in enough water, they lose turgor pressure, the force that pushes against cell walls and prevents plants from wilting. When plants experience extended periods of drought, they wilt, which will usually lead to plant death. Understanding plant water relations is therefore crucial to comprehending natural patterns of productivity. With this knowledge, agriculture and forestry productivity can be maximized (Lambers et al., 1998). This study focused on plant species that exhibit a particular leaf venation pattern. Since different species have distinct venation patterns, we selected those with percurrent leaf venation, i.e., there is one primary vein, with secondary veins extending from that primary vein, tertiary veins extending from one secondary vein to the next, and quaternary veins filling in between the tertiary veins (Figs. 1, 2, 3). Water is transported from the roots to the stem, and from the stem to the
Figure 1. Image of a leaf of Magnolia x soulangeana illustrating primary, secondary, and tertiary veins.
Figure 2. Image of a leaf of Magnolia x soulangeana illustrating primary, secondary, tertiary, and quaternary veins.
Figure 3. Image of a leaf of Magnolia x soulangeana illustrating secondary (pink), tertiary (white), and quaternary (blue) areas.
primary vein of the leaf. Once in the leaf, water is transported through xylem cells in the veins. The following hypotheses were investigated:
1. For fully enlarged leaves, areas of entire leaves, secondary areas, and tertiary areas will be well-scaled.
2. For fully enlarged leaves, primary vein lengths, secondary vein lengths, and tertiary vein lengths will be well-scaled.
3. For enlarging leaves, entire areas, secondary areas, and tertiary areas will be well-scaled.
Materials and Methods Fully-enlarged leaf experiment The leaf samples were collected from the Manhattan College campus vicinity, extending as far as Van Cortlandt Park and Brust Park. In total, 23 different species of herbaceous plants were selected, and from these species the leaf samples were taken. The list of species is shown in Table 1. For each species selected, samples were taken from two trees. Each selected leaf was considered to be fully enlarged. Once all of the samples were collected, they were taken back to the lab where their areas and veins were analyzed. Each species was identified using three sources (Kershner et al., 2008; www.tropicos.org and dendro.cnre.vt.edu).

Table 1. Leaf areas in cm2 (Primary = entire leaf area; Secondary and Tertiary = mean interveinal areas)

Species                                          Primary    Secondary    Tertiary
Amelanchier arborea (F. Michx.) Fernald            66.7       2.35        0.335
Asclepias syriaca                                 235         6.61        0.417
Betula alleghaniensis Britton                      67.4       1.53        0.135
Betula papyrifera                                  44.9       2.14        0.224
Carpinus caroliniana Walter                        43.5       2.304       0.189
Carya tomentosa                                   113         3.37        0.503
Catalpa speciosa (Warder) Engelm.                 311        13.9         2.770
Cornus kousa Bürger ex Miq.                        29         2.62        0.195
Euphorbia pulcherrima Willd. ex Klotzsch           26.4       1.3         0.189
Hibiscus rosa-sinensis L.                          82.6       6.29        0.767
Hydrangea arborescens L.                           60.8       3.35        0.255
Lantana camara L.                                  16.4       1.025       0.119
Liriodendron tulipifera L.                         59.8       4.48        0.433
Magnolia x soulangeana Soul.-Bod.                 125.9       5.27        1.162
Malus pumila Mill.                                 58.7       4.12        0.273
Morus rubra Lour.                                  80.5       2.88        0.247
Ostrya virginiana Britton, Sterns & Poggenb.       26.4       1.37        0.144
Phytolacca americana                              197         9.2         2.900
Salix nigra Marshall                               83.4       2.11        0.243
Tilia americana L.                                 66.8       2.81        0.234
Tilia platyphyllos Scop.                           74.2       3.62        0.225
Ulmus pumila L.                                     7.25      0.262       0.0395
Viburnum lentago L.                               120.4       5.79        0.368
Leaf enlargement experiment For the enlargement component of this study, five leaf species were analyzed. For each species, six leaves were taken as samples, all of different sizes. The smallest leaf was labeled leaf #1, and the largest was labeled leaf #6. Leaf measurements Photographs of the leaves were uploaded to the computer to be analyzed. The computer program ImageJ was used to measure leaf areas and leaf vein lengths. Leaf areas were measured by tracing the perimeter of the entire leaf. Secondary areas were measured by tracing the perimeter of the area between two secondary veins. The same process was used for measuring tertiary areas. The primary vein length was measured as the length of the vein from the base of the leaf all the way to the tip of the leaf. The secondary veins were measured from the base of the primary vein to the outer edge of the leaf, and the tertiary veins were measured as the length from one secondary vein to the adjacent one. Secondary total length is defined as the sum of the lengths of all of the secondary veins within a leaf. Tertiary total length is defined as the sum of all of the tertiary veins within a leaf.
Results

Fully-enlarged leaf experiment
For the 23 species, entire leaf areas ranged from 7.25 cm² to 310.9 cm², with a mean of 86.8 cm² (Table 1). The average secondary area ranged from 0.26 cm² to 14.0 cm², with a mean of 3.86 cm². It was found that as a leaf enlarges and the entire leaf area increases, the secondary areas grow at a related rate. The equation for this relationship is y = 0.038x + 0.545 (r² = 0.86) (Fig. 4). For the 23 species studied, the average tertiary area ranged from 0.040 cm² to 2.89 cm², with a mean of 0.54 cm². There is a strong relationship between entire leaf areas and tertiary areas; the equation for this relationship is y = 0.008x − 0.178 (r² = 0.62) (Fig. 5). Secondary areas are well scaled to tertiary areas (y = 0.219x − 0.307; r² = 0.77) (Fig. 6).

Figure 4. Relationship between total leaf area and secondary areas for 23 species of herbaceous plants (y = 0.038x + 0.55; r² = 0.86).

Figure 5. Relationship between total leaf area and tertiary areas for 23 species of herbaceous plants (y = 0.008x − 0.18; r² = 0.62).

Figure 6. Relationship between secondary areas and tertiary areas for 23 species of herbaceous plants (y = 0.22x − 0.31; r² = 0.77).
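The scaling relationships reported above are ordinary least-squares fits of one area measure against another. As a minimal sketch of how such a fit and its r² can be computed (the function is illustrative, not the original analysis code; the sample values are five of the entire/secondary area pairs from Table 1):

```python
def linear_fit(x, y):
    """Ordinary least-squares fit y = a*x + b, plus the r^2 value."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    a = sxy / sxx                      # slope
    b = my - a * mx                    # intercept
    ss_res = sum((yi - (a * xi + b)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - my) ** 2 for yi in y)
    r2 = 1.0 - ss_res / ss_tot         # coefficient of determination
    return a, b, r2

# Five entire-leaf / secondary-area pairs from Table 1 (cm^2):
leaf_areas = [66.7, 235.0, 67.4, 311.0, 7.25]
secondary_areas = [2.35, 6.61, 1.53, 13.9, 0.262]
slope, intercept, r2 = linear_fit(leaf_areas, secondary_areas)
```

With all 23 species entered, the same computation should reproduce the fits quoted above (e.g., a slope near 0.038 for secondary areas vs. entire leaf areas).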
In addition to leaf areas, leaf vein lengths and total lengths were analyzed for all 23 herbaceous plant species. The length of the primary vein ranged from 77.6 mm to 274 mm, with a mean of 136 mm (Table 2). For all the species studied, secondary total lengths were measured, ranging from 251 mm to 1520 mm, with a mean of 747 mm. Total lengths of all secondary veins were well scaled to primary vein lengths (y = 5.48x − 0.97; r² = 0.79) (Fig. 7). Total lengths of all tertiary veins within the leaves of the 23 species were also measured, and they ranged from 891 mm to 4620 mm, with a mean of 2040 mm. There is a strong relationship between secondary total lengths and tertiary total lengths (y = 2.93x − 144; r² = 0.84) (Fig. 8).

Table 2. Vein lengths in mm

Species                                        Primary  Secondary  Tertiary  Secondary  Tertiary
                                                                               total      total
Amelanchier arborea (F. Michx.) Fernald         122.5     45.9       7.3        734       1438
Asclepias syriaca                               244       50.6      10.3       1517       3878
Betula alleghaniensis Britton                   124       27.0       4.6       1026       2098
Betula papyrifera                               110       32.3       8.0        646       1704
Carpinus caroliniana Walter                     122       39.3       6.4        707       1613
Carya tomentosa                                 196       32.1      10.9       1090       3518
Catalpa speciosa (Warder) Engelm.               104       41.8      12.4        251        967
Cornus kousa Bürger ex Miq.                     132       65.8       7.9        658       1051
Euphorbia pulcherrima Willd. ex Klotzsch        102       26.0       7.38       415       1063
Hibiscus rosa-sinensis L.                       112       48.6      12.6        486       1298
Hydrangea arborescens L.                        125       41.1       6.7        740       1487
Lantana camara L.                               133       48.8      10.3        586       1401
Liriodendron tulipifera L.                       77.6     40.0      13.5        400        891
Magnolia x soulangeana Soul.-Bod.               196       50.8      14.8        914       3374
Malus pumila Mill.                              122       46.3      11.4        648       1813
Morus rubra Lour.                               120       40.6       6.9        731       2070
Ostrya virginiana Britton, Sterns & Poggenb.     93.2     31.7       8.3        507       1552
Phytolacca americana                            273       62.1      26.2       1366       4617
Salix nigra Marshall                            195       28.7       8.09      1148       3430
Tilia americana L.                               78.5     32.2       6.9        515       1435
Tilia platyphyllos Scop.                        108       33.4       7.6        601       2554
Ulmus pumila L.                                 115       26.8       6.9        858       1766
Viburnum lentago L.                             129       35.2       5.5        634       1947
Leaf enlargement experiment
For the enlargement aspect of the study, we focused on five herbaceous plant species: Betula alleghaniensis, Magnolia x soulangeana, Catalpa speciosa, Viburnum lentago, and Hibiscus rosa-sinensis. It was found that as a leaf enlarges, so do its secondary areas. The slopes ranged from 0.021 to 0.067 (Fig. 9). Tertiary areas also enlarged as the total leaf area enlarged for the five herbaceous plant species studied. The tertiary leaf areas were well scaled to the total leaf areas, with slopes ranging from 0.002 to 0.012 (Fig. 10).
Figure 7. Relationship between the primary vein lengths and the total lengths of all secondary veins within leaves of 23 herbaceous plants (y = 5.48x − 0.97; r² = 0.79).

Figure 8. Relationship between the total lengths of all secondary veins and the total lengths of all tertiary veins within leaves of 23 herbaceous plants (y = 2.93x − 144; r² = 0.84).

Figure 9. Secondary leaf areas vs. total leaf areas for five plant species during the leaf enlargement stages. Green symbols: Betula alleghaniensis, y = 0.021x + 0.26, r² = 0.97. Orange symbols: Magnolia x soulangeana, y = 0.035x + 0.33, r² = 0.99. Black symbols: Catalpa speciosa, y = 0.044x + 0.94, r² = 0.98. Blue symbols: Viburnum lentago, y = 0.054x − 0.02, r² = 0.99. Red symbols: Hibiscus rosa-sinensis, y = 0.067x − 0.27, r² = 0.98.
Figure 10. Tertiary leaf areas vs. total leaf areas for five plant species during the leaf enlargement stages. Green symbols: Betula alleghaniensis, y = 0.004x + 0.01, r² = 0.98. Orange symbols: Magnolia x soulangeana, y = 0.009x − 0.06, r² = 0.98. Black symbols: Catalpa speciosa, y = 0.008x − 0.13, r² = 0.99. Blue symbols: Viburnum lentago, y = 0.002x + 0.017, r² = 0.95. Red symbols: Hibiscus rosa-sinensis, y = 0.012x − 0.05, r² = 0.98.
Discussion

The purpose of this study was to determine the relationships between entire leaf areas and interveinal areas, and the relationships between vein lengths. Twenty-three species with a wide range of entire leaf areas were tested. The species with the smallest leaves was Ulmus pumila, with an entire leaf area of 7.25 cm², while the species with the largest was Catalpa speciosa, with an entire leaf area of 311 cm². Despite this large difference, the r² values for the relationships between areas and between leaf vein lengths were surprisingly similar.

For the 23 species in the fully-enlarged portion of the study, leaf areas and leaf vein lengths were determined. Significant relationships were found between the areas, with r² values between 0.62 and 0.86, as well as between leaf vein lengths, with r² values of 0.79 and 0.84.

For the five species studied in the enlargement portion of the study, leaf areas for six samples of different sizes were determined. We found significant relationships between entire leaf areas and secondary areas as the leaves enlarge, with slopes ranging from 0.021 to 0.067, a noteworthy 2 to 1 relationship. For the relationship between entire leaf areas and tertiary areas, the slopes ranged from 0.002 to 0.012, a noteworthy 3 to 2 relationship. Throughout all five species tested, as the leaves enlarged, so did the secondary and tertiary areas.

In the future, we plan to expand the number of species tested for both the fully-enlarged component of the study and the enlargement component. It is hoped that as more species are added, the r² values will remain high and will further reinforce that there are significant relationships between leaf areas and between leaf vein lengths. In addition, xylem cell conductivity will be
studied, with the hope that one can develop an algorithm for water transport through the vein network. With knowledge of conductivity, we will develop a better understanding of how veins efficiently and evenly distribute water throughout leaves.
Acknowledgments
The author is indebted to the Catherine and Robert Fenton Endowed Chair in Biology and to Dr. Lance S. Evans for financial support for this research.
References
Hopkins, G.W., and P.N. Huner. 2009. Plant Physiology. John Wiley & Sons, Inc.
Kershner, B., D. Matthews, and G. Nelson. 2008. Field Guide to Trees of North America. Sterling Publ. Co., New York.
Lambers, H., F.S. Chapin, and T.L. Pons. 1998. Plant Physiological Ecology. Springer, New York.
Machine learning algorithms to determine bark coverage on saguaro cacti (Carnegiea gigantea)

Marissa LoCastro∗

Laboratory of Plant Morphogenesis, Department of Biology, Manhattan College

Abstract. Bark coverage on the surfaces of more than 22 species of tall, long-lived cacti has been found throughout the Americas and has been shown to lead to the premature death of specific cactus species. This study utilized data on bark coverage for 1149 saguaro cacti from 50 desert field plots in Tucson Mountain Park, Tucson, Arizona. These cactus plants were observed in four periods: 1994, 2002, 2010, and 2017. The analysis for this research was completed after the 2017 sampling. Previous research has demonstrated that bark coverage starts on south-facing surfaces and progresses to north-facing surfaces. For the current research, bark-coverage data were processed with WEKA 3.8, a standard machine-learning program, to determine the degree to which bark coverage on one set of surfaces can predict bark coverage on other surfaces. Decision trees were created with the J48 tree classifier, and confusion matrices were produced to determine the accuracy of decision-tree predictions. Cacti were subdivided into bark-coverage groups for this analysis. Cacti in Group A had minimal bark coverage. Previous research has shown that once a north-facing right trough is 80% covered in bark, the cactus plant will die within 8 years; cacti in Group B had this maximum amount of bark coverage prior to cactus death. Cacti in Group C were already dead. The purpose of this research was to use the WEKA 3.8 machine-learning program with data from 1994, 2002, 2010, and 2017 to predict bark accumulation on cactus plants and cactus death. This study also examined whether crests or troughs are the better predictor surfaces of bark accumulation and cactus death. The data obtained support the overall hypotheses that the amounts of bark coverage, and specifically bark coverage on trough surfaces, can be used to determine the likelihood of bark accumulation and the likelihood of cactus death.
Introduction

Young saguaro cacti, like other cactus species, have green surfaces. In recent years, older cactus plants have exhibited bark coverage (Fig. 1) (Duriscoe and Graban, 1992; Evans et al., 1994a; 1994b; 1994c; 1995; Evans, 2005; Turner and Funicelli, 2000; Evans and Macri, 2008; Danzer and Drezner, 2014; De Bonis et al., 2017). Previous research has shown that bark coverage (epidermal browning) of the surfaces of saguaro cacti causes premature death. This bark formation has been documented by many researchers. Photographs prior to the 1950s show little to no bark on the surfaces of saguaros (many books are available at www.arizonahistocialsociety.org; 949 E. 2nd St., Tucson, AZ 85719; Evans et al., 1992). The same photographs prior to the 1950s show very tall (presumed to be old) saguaros, which typically live to be 200 to 300 years of age (Steenbergh and Lowe, 1977; Danzer and Drezner, 2014). A survey of saguaro cacti in Saguaro National Park in 2010 estimated that no saguaro was older than 110 years (Steenbergh and Lowe, 1977). Tucson, Arizona is located at a latitude of 32.2° N. Because of this location, the south-facing surfaces of the cactus plants receive four times more sunlight than the north-facing surfaces (Fig. 2). Previous research has shown that bark coverage starts on south-facing surfaces for saguaros
Research mentored by Lance Evans, Ph.D.
Figure 1. Cactus plants located in Tucson Mountain Park in Tucson, Arizona. On the left is a photo of a healthy cactus plant with minimal bark coverage and exposed green surfaces. On the right is a photo of an unhealthy cactus plant that has brown bark accumulating on its surfaces.

Figure 2. Cross section of a cactus plant. This figure represents the latitude of Tucson Mountain Park, which is 32° North. Because of this location, sunlight exposure on cactus plants is four times greater on the south-facing surfaces than on the north-facing surfaces.
Figure 3. Radial bark accumulation on the stem of a saguaro cactus plant. The bark accumulation begins on the south-facing crest surfaces, then accumulates radially around the east and west-facing surfaces, reaching the north-facing surfaces last.
(Evans et al., 1992; 1994a). Over time, south-facing surfaces accumulate more bark and other surfaces have enhanced bark coverage. Of the surfaces for which data are available, north-facing right troughs are the last surfaces to accumulate bark (Figs. 3 and 4). Once north-facing right
troughs have more than 80% bark coverage, there is a 90% probability that the saguaro will die within eight years (Evans and De Bonis, 2015). A buildup of epicuticular waxes on a cactus surface is the first visible sign that bark coverage will occur. The buildup of epicuticular waxes reduces gas exchange of the underlying tissues. Reduced exchange is coincident with divisions of epidermal cells that eventually produce bark (Evans et al., 1994a; 1994b).
Figure 4. Close-up photos of the trough surfaces in each cardinal direction. The south-facing surfaces have the most bark coverage, followed by the east- and west-facing surfaces, and the north-facing surfaces have the least bark coverage on their trough surfaces.
Similar bark formation occurs on over 20 species of tall, long-lived columnar cactus species like saguaros (Evans, 2005; Evans and Macri, 2008; Evans et al., 1994a; 1994b; 1994c). All species in the Northern Hemisphere show bark coverage beginning on south-facing surfaces (Evans et al., 1992; 1994a), while all species in the Southern Hemisphere show bark coverage beginning on north-facing surfaces (Evans et al., 1994c). Controlled exposures to UV-B resulted in a buildup of epicuticular waxes similar to the response to natural sunlight, suggesting that the cause of bark formation on these long-lived columnar cacti is UV-B from sunlight (Evans et al., 2005). Since the north-facing right trough is the last surface to have bark coverage, and because previous research has shown a 90% probability that a saguaro will die within eight years once its north-facing right trough has more than 80% bark coverage, experiments were designed to predict outcomes of bark coverage and cactus death based upon the percentage of bark coverage on north-facing right troughs. The hypotheses of this study were that (1) bark coverages on cactus surfaces can predict changes in bark coverage on north-facing right troughs and predict cactus death; (2) trough surfaces are better predictors of bark coverage on the north-facing right trough than crest surfaces; and (3) the sum of the trough surfaces can predict cactus death.
Materials and Methods

Cactus plants evaluated
Saguaro cacti native to Tucson Mountain Park (32.2° N, 111.1° W) were analyzed in this study. In 1994, 50 permanent plots with 1149 cacti were randomly selected. The selected cacti were all taller than 4 meters (Evans et al., 2005). GPS data, along with the anatomical features of each cactus, nearby vegetation, and topographical features, were used to identify and distinguish cacti at each study evaluation. The cactus plants evaluated in 1994 were re-evaluated in 2002, 2010, and 2017, and data from the four time periods were compared (Evans et al., 2005).

Evaluation of each cactus
Saguaro cacti have 20 to 23 ribs. Each rib has a protrusion (crest) and a concave portion on each side of the protrusion; these concave portions were termed troughs, by analogy with wave terminology. For all evaluations, the single crest closest to south, east, north, and west was evaluated (Evans et al., 1994a; 1994b; 1995), along with the trough to the right and to the left of each crest. For each cactus, twelve surfaces were evaluated in total. For each surface, an 8 cm (Evans et al., 1994a; 1994b; 1995) vertical strip of surface at 1.75 meters from the ground was evaluated for percent bark coverage (Evans et al., 2013). The percentage of bark coverage was determined visually for each surface; a previous study demonstrated that visual estimates were not different from digitally determined estimates. The percentage of bark coverage for each surface was entered into Microsoft Excel files. For the 2017 survey, photographs of each surface were archived. The Excel data file held the data of the twelve surfaces for the four time periods.

Data evaluations
Previously, evaluations of data were completed by putting each cactus plant in a group, or class, based upon the percentage of bark coverage on south-facing crests (Evans et al., 2013). South-facing crests were used for classing because bark coverage starts on south-facing crests.
Previous results have shown that north-facing right troughs were the last surfaces to have bark coverage. For this study, cacti were therefore placed in groups based upon the percentage of bark coverage on north-facing right troughs:
Group A – Minimal bark coverage: 0 – 19% bark coverage on north-facing right troughs.
Group B – Maximum bark coverage: 80 – 100% bark coverage on north-facing right troughs.
Group C – Cactus dead.
Previous research has shown that once the north-facing right trough is 80% covered in bark, the cactus plant will die within 8 years (Evans and De Bonis, 2015). For this study, this was the criterion used for maximum bark coverage, because it is the maximum bark coverage on the north-facing right trough prior to inevitable cactus death.
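The grouping criteria above can be sketched as a small helper function; the function name, the `alive` flag, and the handling of intermediate coverages are illustrative assumptions, not part of the original analysis:

```python
def bark_group(nrt_bark_pct, alive=True):
    """Assign a cactus to a bark-coverage group from the bark percentage
    on its north-facing right trough (NRT), per the study's criteria.
    (Illustrative sketch; names and intermediate handling are assumed.)"""
    if not alive:
        return "C"              # Group C: cactus already dead
    if nrt_bark_pct <= 19:
        return "A"              # Group A: minimal coverage (0-19%)
    if nrt_bark_pct >= 80:
        return "B"              # Group B: maximum coverage before death
    return None                 # intermediate coverage: not grouped here
```

For example, `bark_group(5)` yields Group A and `bark_group(90)` yields Group B, while a coverage of 50% falls in neither named group.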
Theory of WEKA 3.8
WEKA 3.8 is a machine-learning program that was used to analyze the data sets. WEKA generates decision trees to predict amounts of bark coverage (Hall et al., 2009). Classifiers, or criteria, were generated from a small subset of the entire dataset. WEKA 3.8 used the J48 tree classifier, which implements the C4.5 algorithm, a successor of ID3, to predict target variables in a dataset (Hitchcock, 1971). The J48 tree is a high-accuracy classifier (Hyafil and Rivest, 1976).

Algorithm of MATLAB Validate Model
The MATLAB program called Random Forest created a figure that displays which surfaces best predict cactus death. Since the north-facing right trough is the best predictor of cactus death, the program can then be used to predict the percentage of bark formation on the north-facing right trough. For example, a subset of surfaces can be manually input into the program, and it will output the accuracy with which those surfaces predict bark coverage on the north-facing right trough. This process utilized data from the Master File Excel sheet, which held the bark percentages for all twelve surfaces in 1994, 2002, 2010, and 2017. The algorithm used 10-fold validation on a predictor model to predict the percentage of bark formation on one of the twelve surfaces of a cactus, based on the predictor surfaces input into the program.

Input arguments:
• The predictor surfaces to be used to predict one of the twelve surfaces.
• The one surface out of the twelve that is to be predicted.
• The multivariate logistic model used (De Bonis et al., 2017). The logistic model predicted a mean bark-formation rate for each surface, using data from 1994, 2002, and 2010.

Steps in the algorithm:
• Read in the percentage-of-bark data for all cactus plants.
• Split the data into 10 folds, each containing 10% of the entire data set.
• For each fold, build a model using the other 9 folds.
• Predict each cactus in the left-out fold with that model, and record the error between the value predicted by the program and the actual value in the data sheet.
• Once all ten folds have been evaluated through this leave-one-out method, generate a mean error and standard deviation for all predictors.

Outputs:
• Mean error of the surfaces used to predict bark coverage.
• Standard error of the surfaces used to predict bark coverage.
• Percentage of errors that are less than 5% from the surfaces used to predict bark coverage.
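The MATLAB code itself is not reproduced in the paper. As a sketch of the leave-one-fold-out accounting described in the steps above, under stated assumptions (interleaved fold assignment, absolute error, and a toy predict-the-training-mean model stand in for the study's logistic model):

```python
import statistics

def k_fold_errors(records, targets, fit, predict, k=10):
    """k-fold validation: for each fold, fit a model on the other k-1
    folds, predict the held-out fold, and collect absolute errors."""
    folds = [list(range(i, len(records), k)) for i in range(k)]
    errors = []
    for held_out in folds:
        train = [i for i in range(len(records)) if i not in held_out]
        model = fit([records[i] for i in train], [targets[i] for i in train])
        for i in held_out:
            errors.append(abs(predict(model, records[i]) - targets[i]))
    return statistics.mean(errors), statistics.stdev(errors)

# Toy model: always predict the mean target of the training folds.
fit = lambda xs, ys: sum(ys) / len(ys)
predict = lambda model, x: model
mean_err, sd_err = k_fold_errors(list(range(20)),
                                 [1.0] * 10 + [3.0] * 10, fit, predict)
```

With the toy data above, every held-out prediction is the training mean of 2.0, so every absolute error is exactly 1.0.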
Bark accumulation and cactus death analysis
Each cactus was assigned to a group as stated above, and the data for all cacti in each group were evaluated together. Groups were evaluated for three periods: 1994 to 2002, 2002 to 2010, and 2010 to 2017. For example, cacti in Group A were placed in WEKA 3.8 to determine the cacti that remained in Group A and those that moved to Group B. WEKA 3.8 determined the percentages of bark on the twelve surfaces that best predicted which cacti should go to the two groups. This process was repeated for many groups over the three time periods.

Predictor surfaces analysis
The entire data set of bark coverages on each cactus surface was split into two subsets. The first subset consisted of only trough surfaces; the second consisted of only crest surfaces. Each subset was input into MATLAB's Validate Model and into WEKA 3.8. MATLAB Validate Model produced a standard deviation and a standard error value for each subset. These values were used to see how well each subset predicts bark coverage on the north-facing right trough. WEKA 3.8 was used to see whether the input surfaces can create a logical path, using the input data, to predict bark accumulation or cactus death.

Sum of trough surfaces analysis
The bark coverages on the east-facing right trough, the east-facing left trough, the west-facing right trough, and the west-facing left trough were added together, and this sum was added to the Excel Master File as another feature that can be used to predict cactus death. The new files were entered into WEKA 3.8 with the addition of the sum feature. The Master File was first divided into three groups: cactus plants that fit the criterion for alive using only the sum; cactus plants that fit the criterion for dead using only the sum; and cactus plants in between these criteria, which cannot be termed alive or dead. These three groups were compared using t-test analysis on each surface's bark coverage, as well as on the sum of the four trough surfaces, to see whether the three groups are statistically different from each other.
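The t-test variant is not specified in the text. As one plausible sketch, Welch's two-sample t statistic (which does not assume equal variances between groups) can be computed as below; the sample values are hypothetical bark-coverage percentages, not data from the study:

```python
import math

def welch_t(a, b):
    """Welch's two-sample t statistic (unequal variances assumed)."""
    na, nb = len(a), len(b)
    ma, mb = sum(a) / na, sum(b) / nb
    va = sum((x - ma) ** 2 for x in a) / (na - 1)   # sample variance of a
    vb = sum((x - mb) ** 2 for x in b) / (nb - 1)   # sample variance of b
    return (ma - mb) / math.sqrt(va / na + vb / nb)

# Hypothetical bark-coverage samples (%) from two groups of cacti:
t = welch_t([10.0, 20.0, 30.0], [20.0, 30.0, 40.0])
```

The statistic would then be compared against the t distribution (with Welch-adjusted degrees of freedom) at the chosen significance level.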
Results

The overall purpose of this research was to predict the death of saguaro cacti (Carnegiea gigantea) accurately. The first hypothesis was that bark coverages on several cactus surfaces predict bark coverage on north-facing right troughs and predict cactus death. Data in Table 1 show that a variety of cactus surfaces can predict bark coverage on north-facing right troughs and cactus death. These data were obtained from WEKA 3.8 for the periods 1994 to 2002, 2002 to 2010, and 2010 to 2017. When data of cacti with 0 to 20% bark coverage on north-facing right troughs were used to predict bark coverages between 20 and 80%, the decision trees had accuracies of about 90% (Fig. 5). When the same data were used to predict bark coverages greater than 80%, the decision trees had accuracies of about 98% (Fig. 6). When they were used to predict cactus death, the decision trees had accuracies greater than 99% (Fig. 7).
Figure 5. A decision tree produced by WEKA 3.8. This decision tree utilized the entire database of cactus plants to predict whether a cactus plant that originally has less than 20% bark coverage on its north-facing right trough will accumulate bark, but not exceed bark accumulation of 80% on its north-facing right trough. This prediction is over the sampling period of 1994 to 2002. The numbers adjacent to each arrow are the bark-percentage cutoffs of the surface above, distinguishing between alive cacti and dead cacti. This decision tree is 89.9% accurate.

Figure 6. A decision tree produced by WEKA 3.8. This decision tree utilized the entire database of cactus plants to predict whether a cactus plant that originally has less than 20% bark coverage on its north-facing right trough will accumulate bark that exceeds 80% on its north-facing right trough. This prediction is over the sampling period of 1994 to 2002. The numbers adjacent to each arrow are the bark-percentage cutoffs of the surface above, distinguishing between alive cacti and dead cacti. This decision tree is 98.3% accurate.

Figure 7. A decision tree produced by WEKA 3.8. This decision tree utilized the entire database of cactus plants to predict whether a cactus plant that originally has less than 20% bark coverage on its north-facing right trough will die in the next sampling period (single split: north-left trough ≤ 90%, alive; > 90%, dead). This prediction is over the sampling period of 2010 to 2017. This decision tree is 100% accurate.
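Each node in the WEKA trees above is a single surface split at a bark-percentage cutoff. J48 scores candidate splits by information gain; as a simplified, hypothetical illustration, the decision stump below instead picks the cutoff that minimizes misclassifications on made-up alive/dead data:

```python
def best_cutoff(values, labels):
    """Pick the single cutoff that best separates two classes, predicting
    the majority label on each side of the split. (J48 scores candidate
    splits by information gain; counting misclassifications is used here
    as a simpler stand-in.)"""
    best_split, best_errors = None, len(labels) + 1
    for cut in sorted(set(values)):             # candidate cutoffs
        for low_label in set(labels):           # label predicted for v <= cut
            errs = sum(1 for v, lab in zip(values, labels)
                       if (lab == low_label) != (v <= cut))
            if errs < best_errors:
                best_split, best_errors = (cut, low_label), errs
    return best_split, best_errors

# Hypothetical north-left-trough coverages (%) with alive/dead outcomes:
coverage = [10, 40, 70, 85, 92, 95, 99]
status = ["alive", "alive", "alive", "alive", "dead", "dead", "dead"]
(cut, low_label), errors = best_cutoff(coverage, status)
```

On this toy data the stump recovers a perfect split at 85%, analogous to the single 90% split in Fig. 7.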
Table 1. Confusion matrices and predictive surfaces of the WEKA 3.8 decision trees used to predict bark coverage on the north-facing right troughs and cactus death of saguaro cacti (Carnegiea gigantea)

Period       Predictive surfaces (cutoffs, %)           Initial   Initial    Final     Final    Accuracy
                                                        correct   incorrect  incorrect correct  (%)

Minimum bark coverage (0 to 20%) predicting increase in bark coverage (20 to 80%)
1994 – 2002  NL, WL, NL, ER, WL (10, 15, 25, 80, 12)      539        40         27       55      89.9
2002 – 2010  NL, WL, NC, WL, WR (16, 50, 75, 85, 92)      476        11         32       11      91.9
2010 – 2017  NL, WL, WR, WR, WR (10, 17, 25, 40, 55)      385         3         27        5      90.9

Minimum bark coverage (0 to 20%) predicting maximum increase in bark coverage (>80%)
1994 – 2002  NL, WR (45, 95, 95)                          573         6          4        5      98.3
2002 – 2010  NC, NL, WL (85, 8, 20)                       480         7          6        3      97.4
2010 – 2017  NO TREE                                      387         1          5        0      98.5

Minimum bark coverage (0 to 20%) predicting cactus death
1994 – 2002  NL (95)                                      577         2          0      109      99.7
2002 – 2010  NL (90)                                      486         1          2       49      99.4
2010 – 2017  NL (90)                                      387         1          0       39     100.0

NL = north-facing left trough; WL = west-facing left trough; ER = east-facing right trough; NC = north-facing crest; WR = west-facing right trough.
Table 2. Comparison between the trough surfaces and the crest surfaces as predictors of bark accumulation on the north-facing right trough and cactus death of saguaro cacti (Carnegiea gigantea). Data are from the MATLAB Validate Model and WEKA 3.8 machine-learning programs.

                             Crest surfaces        Trough surfaces
                             (SC, EC, NC, WC)      (ER, EL, NL, WR, WL)
MATLAB standard deviation         21.8                  14.6
MATLAB standard error              0.3                   0.2
WEKA 3.8 path produced            Never                 Always

SC = south-facing crest; EC = east-facing crest; NC = north-facing crest; WC = west-facing crest; ER = east-facing right trough; EL = east-facing left trough; NL = north-facing left trough; WR = west-facing right trough; WL = west-facing left trough.
Data in Table 1 and Figs. 5, 6, and 7 show that trough surfaces were used to predict bark accumulation on north-facing right troughs, as well as to predict cactus death. These data lead to the second hypothesis, that troughs are better predictors of bark coverage and cactus death than crests (Table 2). The MATLAB Validate Model machine-learning program gave standard deviations of 21.8 and 14.6 for crests and troughs, respectively; the lower standard deviation for troughs indicates that troughs were better predictors than crests. In addition, WEKA 3.8 analysis with crest surfaces only never produced decision trees, while trough surfaces always produced decision trees.

Since troughs were the best predictors of bark coverage and cactus death, analyses were performed with the database to determine whether the sum of east and west troughs alone was a good predictor of whether cacti lived or died. Within the database there were cacti from 1994 that were alive or dead in 2002, cacti from 2002 that were alive or dead in 2010, and cacti from 2010 that were alive or dead in 2017. These three populations were selected for the following analysis. The next step was to substitute 'sum' values for all other surfaces except the north-facing right troughs. Data in Fig. 8 show that, with 100% accuracy, cacti with 'sum' values less than 108 were always alive and cacti with 'sum' values greater than 324 were always dead, while values intermediate between 108 and 324 could not be predicted accurately. For these three populations, the mean bark-coverage values are shown in Table 3. Excluding north-left troughs and west-left troughs, bark coverages of all surfaces were greater than 90% for cacti predicted to be dead. In contrast, bark coverages for east, north, and west surfaces of alive cacti were all less than 35%.
Figure 8. This figure utilizes the sum of the bark coverage on the east-facing right trough, the east-facing left trough, the west-facing right trough, and the west-facing left trough to predict whether a cactus plant is alive or dead (sum ≤ 108: alive; 108 < sum ≤ 324: cannot be predicted; sum > 324: dead). This figure is 100% accurate. It divides the cactus population into cacti that will remain alive, cacti that will die, and cacti that cannot be predicted.
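The cutoffs in Fig. 8 amount to a simple three-way rule, sketched below (the function name is illustrative, not from the original analysis):

```python
def sum_rule(trough_sum):
    """Classify a cactus from the summed bark coverage (%) of the east-
    and west-facing right and left troughs, using the Fig. 8 cutoffs."""
    if trough_sum <= 108:
        return "alive"
    if trough_sum > 324:
        return "dead"
    return "cannot be predicted"    # intermediate sums (108, 324]

# Example: troughs at 20%, 15%, 30%, and 25% bark coverage (sum = 90)
status = sum_rule(20 + 15 + 30 + 25)
```

With a sum of 90, the example cactus falls below the 108 cutoff and is classified as alive.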
Since the 'sum' provided excellent ability to predict alive and dead cacti, the 'sum' category was added to the entire database. The new file processed through WEKA 3.8 gave the results in Fig. 9, in which the 'sum' category was not selected. The Fig. 9 decision tree is 97% accurate. The next step was to remove the east and west troughs, since they were used to make the 'sum' category. The new file processed through WEKA 3.8 resulted in Fig. 10, in which the 'sum' category was present and provided 97% accuracy. These data confirm that east and west troughs are the best predictors of whether saguaro cacti live or die, if north-facing left troughs are not included.
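The preprocessing described above (append a 'sum' feature, then drop the troughs it was built from so the learner must fall back on 'sum') can be sketched in plain Python. The record keys below are hypothetical shorthand for the database columns:

```python
# Surfaces whose bark coverages are added to form the 'sum' feature
SUM_PARTS = ["ER", "EL", "WR", "WL"]  # east/west right and left troughs

def add_sum_feature(record):
    """Return a copy of one cactus record with a 'sum' column appended."""
    out = dict(record)
    out["sum"] = sum(record[s] for s in SUM_PARTS)
    return out

def drop_sum_parts(record):
    """Remove the constituent troughs so a learner must use 'sum' instead."""
    return {k: v for k, v in record.items() if k not in SUM_PARTS}

# Example record using the mean coverages of the 'alive' group in Table 3
cactus = {"SC": 50, "ER": 8, "EL": 13, "WR": 11, "WL": 7, "NL": 5}
with_sum = add_sum_feature(cactus)   # adds sum = 8 + 13 + 11 + 7 = 39
ablated = drop_sum_parts(with_sum)   # keeps SC, NL, and sum only
```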
120
The Manhattan Scientist, Series B, Volume 5 (2018)
LoCastro
Table 3. Mean bark coverages (%) on surfaces of saguaro cactus plants (Carnegiea gigantea). Bark coverage values are for cactus plants that meet the criteria: cacti that will remain alive, cacti that will die, and cacti that cannot be predicted as alive or dead (p < 0.01).

Population         SC1   SR2   SL3   EC4   ER5   EL6   NC7   NL8   WC9   WR10   WL11   SUM12
Alive (n = 900)     50    19    20    32     8    13    16     5    29     11      7      40
Middle (n = 446)    89    66    68    84    46    70    50    19    77     58     27     202
Dead (n = 100)     100    97    98    99    94    99    91    60    99     98     86     377

1 SC = south-facing crest; 2 SR = south-facing right trough; 3 SL = south-facing left trough; 4 EC = east-facing crest; 5 ER = east-facing right trough; 6 EL = east-facing left trough; 7 NC = north-facing crest; 8 NL = north-facing left trough; 9 WC = west-facing crest; 10 WR = west-facing right trough; 11 WL = west-facing left trough; 12 SUM = sum of east- and west-facing troughs
Fig. 9. [Decision tree: root split on the north-left trough at 60; subsequent splits on the west-right trough (50), east-left trough (65, 97), and east crest (98); leaves labeled Alive or Dead.]
Figure 9. This figure is a decision tree produced by WEKA 3.8. It utilized the entire database of cactus plants to predict whether a cactus plant is alive or dead. The tree was generated using all cactus surfaces and the sum of the trough surfaces. The trough surface bark coverages that were added together are the east-facing right trough, the east-facing left trough, the west-facing right trough, and the west-facing left trough. The numbers adjacent to each arrow are the bark-percentage cutoffs of the surface above, used to distinguish between alive and dead cacti. This decision tree is 97% accurate.
Fig. 10. [Decision tree: root split on the north-left trough at 60; subsequent splits on the north crest (85, 90, 94), east crest (98), south-right trough (72), south-left trough (95), north-left trough (80, 87, 90), and the 'sum' feature (275, 375, 397); leaves labeled Alive or Dead.]
Figure 10. This figure is a decision tree produced by WEKA 3.8. It utilized the entire database of cactus plants to predict whether a cactus plant is alive or dead. The tree was generated by omitting the surfaces used to create the 'sum' feature, but still including the 'sum' of those surfaces. This means that the following features were used to generate this decision tree: south-facing right trough, south-facing left trough, south-facing crest, east-facing crest, west-facing crest, north-facing left trough, north-facing crest, and the sum of the omitted trough surfaces. The numbers adjacent to each arrow are the bark-percentage cutoffs of the surface above, used to distinguish between alive and dead cacti. This decision tree is 97% accurate.
Discussion
Bark accumulation on saguaros is similar to bark accumulation on twenty other species of columnar cacti around the Americas. Columnar cacti in northern Mexico show extensive bark on south-facing stems, and columnar cacti in Ecuador show the same characteristic barking due to exposure to sunlight (Evans et al., 1992; Evans et al., 1994c; Evans, 2005; Evans and Macri, 2008). For all these cactus plants, bark accumulation occurred on aged cacti, not small cacti, because the older cactus plants have been exposed to sunlight for longer periods of time. From previous analysis, it is clear that bark accumulation on saguaro cactus plants begins on south-facing surfaces before all other surfaces (De Bonis et al., 2017). Bark on all other surfaces accumulates after the south-facing surfaces, specifically the south-facing crest. Prior to complete bark coverage on south-facing surfaces, the east- and west-facing surfaces begin to accumulate bark. Because of the southern inclination of east-facing left troughs, they begin to accumulate bark three years after the bark buildup on their respective crests. In contrast, the east-facing right trough has a six-year delay in bark accumulation compared to its respective crest (De Bonis et al., 2015). As the cactus plants accumulate bark radially around their stems, there is an exponential increase in bark accumulation following the bark formation on the south crest. The north-facing right trough is the last surface to accumulate bark before cactus death. Because the north-facing right trough is the last surface to be damaged by sunlight, machine learning programs can use bark coverage on the north-facing right trough to predict cactus death.
The database for this research consists of over 55,000 data points. With a database that large, machine learning programs are required to manipulate and understand the data. WEKA 3.8 and MATLAB algorithms were used to analyze this large database. Throughout the entire 24-year period of study, there are characteristic surfaces that can be used to predict the increase in bark accumulation following initial bark growth on south-facing crests. The north-facing left trough is repeatedly the determining surface for the amount of bark accumulation that will occur over each 8-year sampling interval. After the north-facing left trough, other trough surfaces were almost always selected as major predictor surfaces. Bark accumulation on a saguaro cactus over an 8-year period can be analyzed to identify a predictive pattern of bark progression and overall health degradation of the plant. The average accuracy with which WEKA 3.8 predicts cactus death is almost 10% higher than when WEKA 3.8 predicts bark accumulation that does not exceed 80%. WEKA 3.8 is therefore a more accurate predictor of cactus death than of bark accumulation.
MATLAB algorithms and WEKA 3.8 confirm that trough surfaces are better predictors of bark accumulation and cactus death. The 'sum' of bark coverage on east- and west-facing trough surfaces splits the cactus population into three groups: cacti that are going to live, cacti that are going to die, and cacti that cannot be predicted. Any cacti that have a 'sum' of east and west trough surfaces that is less than 108, or 27% mean bark coverage, will remain alive. Any cacti that have a 'sum' of east and west trough surfaces that is greater than 324, or 81% mean bark coverage, will die. Cacti with coverage between 27% and 81% cannot be accurately predicted as alive or dead. The three groups, alive, dead, and unpredictable, are all statistically significant. The 'sum' of the east- and west-facing troughs is not utilized in the path to predict cactus death when added to the original database. The 'sum' feature is utilized only when the east- and west-facing troughs are removed from the database. The east and west troughs are required to predict cactus death: when they are available as predictive features they are used by WEKA 3.8, and when they are removed, but the 'sum' feature remains, WEKA 3.8 utilizes the 'sum' of the removed troughs to predict cactus death. Therefore, the east and west troughs are always used as predictive surfaces to some degree.
This study concluded that WEKA 3.8 and MATLAB algorithms can be utilized to show that trough surfaces are the major predictors of cactus death. More specifically, east and west trough surfaces can be used to predict cactus death with an accuracy of 97%. This study can be continued further to isolate the major predictors involved in bark progression. Ultimately, this understanding can be used to isolate characteristics of extremely unhealthy cacti right before mortality, to fully understand the premature death of saguaros. Saguaro cacti may live to over 200 years of age, yet the predictable rates of bark accumulation show complete bark coverage within 45 years of initial bark accumulation on the south-facing crest (Steenbergh and Lowe, 1977; Danzer and Drezner, 2014). This 45-year span is inconsistent with the predicted 200 years of life for saguaro cactus plants (O'Brien et al., 2011). Turner and Funicelli (2000) demonstrated that 16.5% of saguaro cacti in Tucson Mountain Park died in the 10-year period from 1990 to 2000. This is consistent with the bark-accumulation mortality rates on adult saguaros of 2.3% and 2.1% per year over the past several decades (Evans et al., 2005; 2013; Turner and Funicelli, 2000). The high mortality rate of adult saguaros, coupled with the fact that the oldest saguaros in Tucson Mountain Park were only 110 years old in 2010, suggests there are detrimental changes for adult saguaro cacti in Tucson, Arizona. Surveys of saguaros do not acknowledge epidermal browning as a cause of mortality, and the circumstances are not unique to Tucson Mountain Park's saguaros but are rather happening to twenty species of columnar cacti spread across the Americas.
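The consistency of a roughly 2% annual mortality rate with the 16.5% ten-year loss can be checked by compounding: a constant annual rate r removes 1 − (1 − r)^10 of the population in a decade. A quick arithmetic sketch with the rates cited above:

```python
def decade_mortality(annual_rate):
    """Fraction of a population lost over 10 years at a constant annual rate."""
    return 1 - (1 - annual_rate) ** 10

# Annual adult-saguaro mortality rates cited in the text
for r in (0.021, 0.023):
    print(f"{r:.1%}/yr -> {decade_mortality(r):.1%} over 10 years")
# 2.1%/yr compounds to about 19% and 2.3%/yr to about 21% per decade,
# in the neighborhood of the 16.5% decade loss of Turner and Funicelli (2000)
```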
If this epidemic begins to affect younger cactus plants, before the age of 100 years, and the expected life span after initial browning is only 45 years, then there will eventually be few or no adult saguaro cactus plants left (Evans and DeBonis, 2013; DeBonis et al., 2017).
Acknowledgement The author appreciates the financial support of the Catherine and Robert Fenton Biology Endowed Chair to Dr. L. S. Evans.
References
Danzer, S. and T.D. Drezner. 2014. Relationship between epidermal browning, girdling damage, and bird cavities in a military restricted database of 12,000+ plants of the keystone Carnegiea gigantea in the northern Sonoran Desert. Madrono 61: 115-125.
De Bonis, M., L. Barton, and L.S. Evans. 2017. Rates of bark formation on surfaces of saguaro cacti (Carnegiea gigantea). Journal of the Torrey Botanical Society 144: 1-8.
Duriscoe, D.M. and S.J. Graban. 1992. Epidermal browning and population dynamics of giant saguaros in long-term monitoring plots, p. 237-262. In C.P. Stone and S. Bellantoni [eds.], Proceedings of the Symposium of Research in Saguaro National Monument. Southwest Parks and Monuments Association, Tucson, AZ.
Evans, L.S. 2005. Stem surface injuries to Neobuxbaumia tetetzo and Neobuxbaumia mezcalaensis of the Tehuacan Valley of Central Mexico. J. Torrey Bot. Soc. 132: 33-37.
Evans, L.S., P. Boothe, and A. Baez. 2013. Predicting morbidity and mortality for a saguaro cactus (Carnegiea gigantea) population. J. Torrey Bot. Soc. 140: 247-255.
Evans, L.S., V.A. Cantarella, L. Kaszczak, S.M. Krempasky, and K.H. Thompson. 1994b. Epidermal browning of saguaro cacti (Carnegiea gigantea): Physiological effects, rates of browning and relation to sun/shade conditions. Environ. Exp. Bot. 34: 107-115.
Evans, L.S., V.A. Cantarella, K.W. Stolte, and K.H. Thompson. 1994a. Phenological changes associated with epidermal browning of saguaro cacti at Saguaro National Monument. Environ. Exp. Bot. 34: 9-17.
Evans, L.S., and M. DeBonis. 2015. Predicting morbidity and mortality of saguaro cacti (Carnegiea gigantea). J. Torrey Bot. Soc. 142: 231-239.
Evans, L.S., K.A. Howard, and E. Stolze. 1992. Epidermal browning of saguaro cacti (Carnegiea gigantea). J. Torrey Bot. Soc. 142: 231-239.
Evans, L.S., C. McKenna, R. Ginocchio, G. Montenegro, and R. Kiesling. 1994c. Surficial injuries to several cacti of South America. Environ. Exp. Bot. 35: 105-117.
Evans, L.S., and A. Macri. 2008. Stem surface injuries of several species of columnar cacti of Ecuador. J. Torrey Bot. Soc. 142: 231-239.
Evans, L.S., V. Sahi, and S. Ghersini. 1995. Epidermal browning of saguaro cacti (Carnegiea gigantea): Relative health and rates of surficial injuries of a population. Environ. Exp. Bot. 35: 557-562.
Evans, L.S., A.J. Young, and Sr. J. Harnett. 2005. Changes in the scale and bark stem surface injuries and mortality rates of a saguaro (Carnegiea gigantea) cacti population in Tucson Mountain Park. Can. J. Bot. 83: 311-319.
Hall, M., E. Frank, G. Holmes, B. Pfahringer, P. Reutemann, and I.H. Witten. 2009. The WEKA data mining software: An update. SIGKDD Explorations 11(1).
Hitchcock, A. 1971. Manual of the Grasses of the United States. Dover Publications, New York, NY.
Hyafil, L. and R. Rivest. 1976. Constructing optimal binary decision trees is NP-complete. Inform. Process. Lett. 5: 15-17.
O'Brien, K., D. Swann, and A. Springer. 2011. Results of the 2010 saguaro census at Saguaro National Park. National Park Service, U.S. Department of the Interior, Tucson, AZ. 49 p.
Steenbergh, W.F. and C.H. Lowe. 1977. Ecology of the Saguaro: II. Reproduction, germination, establishment, growth and survival of the young plant. National Park Service Monograph Series Eight.
Turner, D. and C. Funicelli. 2000. Ten-year resurvey of epidermal browning and population structure of saguaro cactus (Carnegiea gigantea) in Saguaro National Park.
Determination of the temporal and spatial variation of Giardia lamblia in ribbed mussels (Geukensia demissa) from Bronx, NY Alexa Marcazzo∗ Department of Biology, Manhattan College Abstract. Bivalves are filter feeders which are able to trap parasites in their tissues. We have previously shown evidence of human intestinal parasites in several bivalve species collected at New York City beaches. The purpose of this project is to determine the temporal and spatial variation of Giardia lamblia (also known as Giardia duodenalis and Giardia intestinalis) in bivalves collected from beaches in New York City in 2017. Geukensia demissa was collected from Clason Point and Orchard Beach in Bronx, New York, in the fall of 2017. Each sample was dissected to collect the digestive gland, mantle, gill, adductor muscle, foot, and hemolymph. DNA was extracted from these tissues, and PCR was performed with two specific primers to detect the presence of the β-Giardin gene, found only in Giardia lamblia. From the data collected thus far, the prevalence of Giardia lamblia in Geukensia demissa collected from Orchard Beach and Clason Point was found to be 42.0% and 36.1%, respectively, indicating similar G. lamblia prevalences at these two locations, which are 10 miles apart. This is different from what was observed in 2016, when the prevalence of G. lamblia was significantly higher at Orchard Beach (28%) than at Clason Point (19%). G. lamblia assemblage A, usually associated with humans, was previously found in the tissues of bivalves collected in 2014. We will be in a position to obtain the genotype of G. lamblia from the bivalves collected at both locations. Our data indicate that bivalves can be considered bio-sentinels for human intestinal parasites in aquatic environments.
Introduction
Biosentinels, or bioindicators, are species that are used to detect pollutants in the environment (Adell et al., 2014). In marine environments, bioindicators can be used to detect contamination by heavy metals and pesticides, as well as contamination of fecal origin from bacteria, viruses, and parasites (Miller et al., 2005). One example of a marine bioindicator is the penguin, a flightless bird that has been used to detect mercury (Hg) levels in the Southern Ocean (Carravieri et al., 2013). Bivalves have been used as bioindicators in aquatic environments for decades (Miller et al., 2005). In the past, they were frequently used as bioindicators of heavy metals and pesticides, but more recently they have been used for detection of bacteria, viruses, and parasites (Miller et al., 2005). Mussels are successful bioindicators because they are filter feeders and can filter large volumes of water in one location, allowing them to concentrate chemicals from their environment (Miller et al., 2005; O'Connor, 2002). Mussels are also of interest because oocysts can be detected in their tissues even when they cannot be detected in the ocean water (Miller et al., 2005).
Giardia lamblia is the most common intestinal protozoan parasite in the world; the estimated number of cases worldwide per year is 280 million (Baldursson and Karanis, 2011). G. lamblia produces cysts that are released in the feces of the infected animal (Lalle et al., 2005). It is transmitted by the fecal-oral route, and can be spread between humans, or from animal to human (Koehler et al., 2014). The symptoms of giardiasis, the disease caused by ingested G. lamblia cysts, are diarrhea, colic, headaches, dehydration, malabsorption, and weight loss and/or wasting (Koehler et al., 2014). In many cases, giardiasis causes an asymptomatic infection in immunocompetent individuals (Koehler et al., 2014). G. lamblia can contaminate not only water supplies but swimming pools as well, and is commonly associated with child daycare centers (Baldursson and Karanis, 2011). It is a universal problem because conventional water treatments, such as chlorine disinfection or filtration, do not successfully remove or inactivate infectious cysts, which are then able to remain active in drinking water and cause outbreaks (Betancourt and Rose, 2004). Consumption of infected bivalves or water puts a person at risk of contracting giardiasis (Feng and Xiao, 2011). While G. lamblia is known to infect fresh water, it can reach the ocean through land or agricultural runoff and sewage overflows (Graczyk et al., 2008).
There are multiple known assemblages of G. lamblia, which are associated with different organisms. There are eight known assemblages (A-H), distinguished at the β-Giardin gene, which is specific to G. lamblia (Heyworth, 2016). Humans are only known to be infected by assemblages A and B, which can also infect other mammals; assemblages C-H are believed to be more host specific (Lalle et al., 2005). Sequencing the DNA of Giardia-positive samples can determine the assemblage, and might give clues to the source of the infection (Heyworth, 2016). This is the fourth year the project has been carried out, which allows for the comparison of data over time and provides a better picture of the occurrence of this parasite in the bivalves collected from the Bronx.
∗ Research mentored by Ghislaine Mayer, Ph.D.
In 2014, a Giardia prevalence of 4.5% was observed in Geukensia demissa collected from Orchard Beach (Tei et al., 2016). In 2015, Giardia was not observed in Geukensia demissa collected at Orchard Beach, while a sharp increase to 28% was noted in 2016 (Limonta and Dolce, unpublished observations). In 2016, Geukensia demissa was also collected at Clason Point (Soundview); the prevalence of Giardia lamblia in Geukensia demissa collected at Clason Point was 19% (Limonta and Dolce, unpublished observations). Therefore, a variation in the prevalence of Giardia was observed between the two locations. The goal of this study was to investigate the presence of the human intestinal parasite Giardia lamblia in ribbed mussels (Geukensia demissa) collected from the Bronx, NY in 2017, and to determine the change in prevalence between the two locations, over time, between tissues, and by genotype.
Materials and Methods Bivalve collection A total of 69 samples of Geukensia demissa (ribbed mussels) were collected from Orchard Beach and a total of 61 samples were collected from Clason Point, both in Bronx, New York. The beaches are about 10 miles apart (Fig. 1). Three other species of bivalves were collected at each collection site, including Mytilus edulis (blue mussels), Mya arenaria (soft-shell clams), and Crassostrea virginica (oysters).
Figure 1. Collection sites (https://www.google.com/maps/dir/Orchard+Beach,+Bronx,+NY/Clason+Point,+Bronx,+NY/@40.8417037,-73.8593684,13z/data=!3m1!4b1!4m14!4m13!1m5!1m1!1s0x89c28c5fdfb5d393:0xeb511a8331f79e2f!2m2!1d-73.7922928!2d40.8672934!1m5!1m1!1s0x89c2f42780f4b029:0x5ad10f293eba68b6!2m2!1d-73.8564048!2d40.8159508!3e2)
Each sample was dissected to collect tissues, such as the digestive gland, mantle, gill, adductor muscle, foot, and hemolymph (Fig. 2). A total of 260 tissues from 69 samples of Geukensia demissa collected from Orchard Beach and a total of 258 tissues from 61 samples of Geukensia demissa collected from Clason Point were obtained. DNA from each organ was extracted using the Qiagen Blood and Tissue Kit (Germantown, MD, USA).
Figure 2. Anatomy of Geukensia demissa
Giardia lamblia detection
The DNA from each tissue was amplified using the polymerase chain reaction (PCR) and previously developed primers specific to the β-Giardin gene, which is found only in Giardia lamblia. In the first round of PCR, the DNA was amplified with the Gia7 forward primer (5′-AAGCCCGACGACCTCACCCGCAGTGC-3′) and the Gia759 reverse primer (5′-GAGGCCGCCCTGGATCTTCGAGACGAC-3′). In the second round, the DNA was amplified with
the Gia7 nested forward primer (5′-GAACGAACGAGATCGAGGTCCG-3′) and the Gia759 nested reverse primer (5′-CTCGACGAGCTTCGTGTT-3′) (Cacciò et al., 2002). The positive control in the PCR was purified Giardia lamblia DNA, and the negative control was water. The amplicons were visualized by agarose gel electrophoresis.
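The readout of this assay is essentially "does the expected amplicon appear on the gel?". The annealing logic can be illustrated with a toy in-silico PCR sketch (the 6-bp primers and 40-bp template below are invented for illustration; they are NOT the real Gia7/Gia759 sequences):

```python
def revcomp(seq):
    """Reverse complement of a DNA string."""
    comp = {"A": "T", "T": "A", "G": "C", "C": "G"}
    return "".join(comp[b] for b in reversed(seq))

def amplicon(template, fwd, rev):
    """Predict the PCR product for a forward and a reverse primer (both
    written 5'->3'); the reverse primer anneals to the bottom strand, so
    its reverse complement is searched on the template."""
    start = template.find(fwd)
    end = template.find(revcomp(rev))
    if start == -1 or end == -1:
        return None                       # a primer fails to anneal: no band
    return template[start:end + len(rev)]

# Hypothetical template and primers, for illustration only
template = "AAGCCCTTTTGGGGAAAACCCCTTTTGGGGAAAACACGTT"
product = amplicon(template, "AAGCCC", "AACGTG")   # full 40-bp product
missing = amplicon(template, "AAGCCC", "AAAAAA")   # None: no binding site
```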
Results
The prevalence of G. lamblia in G. demissa collected in 2017 from marine environments in the Bronx was determined per organism and tissue. Samples of G. demissa from both Orchard Beach and Clason Point tested positive for G. lamblia DNA (Figs. 3 and 4). A total of 29/69 (42.0%) samples from Orchard Beach and 22/61 (36.1%) samples from Clason Point tested positive for G. lamblia DNA. A higher prevalence was found at Orchard Beach, suggesting more pollution of the water (Table 1). The prevalence of Giardia lamblia has increased in each location since 2016. We also determined the prevalence of G. lamblia DNA in the tissues of each organism, which varied between the two beaches. The specimens collected from Orchard Beach had a high prevalence

Figure 3. Representative agarose gel electrophoresis of a positive G. lamblia sample from Orchard Beach. Lane 1: 100 bp marker; Lane 2: positive control (551 bp); Lanes 5, 7, 8, 10, 11, 12, 13, and 14: positive for G. lamblia; Lane 15: negative control

Figure 4. Representative agarose gel electrophoresis of a positive G. lamblia sample from Clason Point. Lane 1: 100 bp marker; Lane 2: positive control (551 bp); Lanes 9, 10, 12, 13, 15, 16, and 18: positive for G. lamblia; Lane 20: negative control
of G. lamblia DNA in the hemolymph and adductor muscle (Table 1); in the specimens collected from Clason Point, the highest prevalences were in the digestive gland and mantle (Table 2).

Table 1. Prevalence of G. lamblia in tissue from G. demissa collected from Orchard Beach, New York in 2017

Digestive Gland   Mantle          Adductor Muscle   Foot           Gills           Hemolymph
15/49 (30.6%)     14/52 (26.9%)   18/51 (35.3%)     7/29 (24.1%)   17/50 (34.0%)   6/13 (46.2%)
Table 2. Prevalence of G. lamblia in tissue from G. demissa collected from Clason Point, New York in 2017

Digestive Gland   Mantle          Adductor Muscle   Foot           Gills           Hemolymph
11/50 (22.0%)     12/50 (24.0%)   7/52 (13.5%)      9/50 (18.0%)   9/48 (18.8%)    Not determined
Discussion
G. lamblia DNA was detected in G. demissa collected from each location, and was detected in all tissues tested. The prevalence of G. lamblia was higher in Geukensia demissa collected at Orchard Beach (42.0%) than at Clason Point (36.1%) (Table 3). The prevalence of G. lamblia in bivalves collected from Orchard Beach and Clason Point in the Bronx in 2017 has increased significantly since previous years. The prevalence between locations has consistently been different: in 2017, the prevalence was 42.0% at Orchard Beach and 36.1% at Clason Point. The difference in prevalence could be caused by different levels of sewage or land runoff contaminating each area of the ocean.

Table 3. Prevalence of G. lamblia in G. demissa collected from Orchard Beach and Clason Point in New York in 2017

Orchard Beach    Clason Point
29/69 (42.0%)    22/61 (36.1%)
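Whether 42.0% versus 36.1% is a meaningful difference can be examined with a standard two-proportion z-test using the 2017 counts (29/69 and 22/61). A stdlib-only sketch; the choice of test is ours, not the authors':

```python
import math

def two_proportion_z(x1, n1, x2, n2):
    """z statistic for H0: the two underlying prevalences are equal."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)                     # prevalence under H0
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return (p1 - p2) / se

# 2017 counts from Table 3: Orchard Beach 29/69, Clason Point 22/61
z = two_proportion_z(29, 69, 22, 61)
# |z| is well below 1.96, so at the 5% level the two 2017 prevalences
# are statistically similar, consistent with the abstract's claim
```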
The highest prevalence per tissue in 2014 at Orchard Beach was in the gills (4.5%), while in 2017 the highest prevalences per tissue were found in the hemolymph (46.2%) and adductor muscle (35.3%) (Table 4) (Tei et al., 2016). The adductor muscle and hemolymph were not tested in 2014; however, the digestive gland and mantle were tested and were found to have a 0% prevalence of G. lamblia (Tei et al., 2016). This is very different from the prevalence determined in 2017, when the digestive gland had a high prevalence of 30.6% and the mantle a prevalence of 26.9%. The prevalence of G. lamblia in G. demissa has been found to be higher at Orchard Beach than at Clason Point during the past two years of the study. In 2014, the prevalence was observed
Table 4. Prevalence of G. lamblia in tissue from G. demissa collected from Orchard Beach, NY in 2014 and 2017

Year   Digestive Gland   Mantle          Adductor Muscle   Foot           Gills           Hemolymph
2014   0                 0               not determined    1/44 (2.2%)    2/44 (4.5%)     not determined
2017   15/49 (30.6%)     14/52 (26.9%)   18/51 (35.3%)     7/29 (24.1%)   17/50 (34.0%)   6/13 (46.2%)
to be 4.5% at Orchard Beach, and in 2015 no Giardia lamblia DNA was detected in the samples collected from Orchard Beach; no samples from Clason Point were collected in 2014 and 2015. In 2016, the prevalence was observed to be 28% at Orchard Beach and 19% at Clason Point (Table 5) (Limonta and Dolce, unpublished observations). The prevalence at Orchard Beach has increased each year, except for 2015, when there was no detection of Giardia lamblia DNA. Between the first and second years of the study (2014-2015), the prevalence decreased from 4.5% to 0% at Orchard Beach. Between the second and third years (2015-2016), the prevalence increased by 28 percentage points, and by 14 percentage points between the third and fourth years (2016-2017) at Orchard Beach. At Clason Point, the prevalence increased by 17.1 percentage points between the third and fourth years (2016-2017).

Table 5. Prevalence of G. lamblia in G. demissa collected from Orchard Beach and Clason Point, NY in 2014, 2015, 2016, and 2017

Year   Orchard Beach    Clason Point
2014   4.5%             not determined
2015   0%               not determined
2016   28%              19%
2017   29/69 (42.0%)    22/61 (36.1%)

The presence of G. lamblia in G. demissa at Bronx beaches is likely due to oocysts in sewage that ends up in the ocean. In one study done by Graczyk et al. (2008), G. lamblia oocysts were found in sludge from two wastewater treatment plants in Poland. This sewage sludge is typically spread on agricultural lands as a way to fertilize the land with nitrogen and phosphorus; however, rainfall can cause the sludge to spread into nearby water sources. This poses a threat to animals and humans who can be infected by the oocysts, especially because conventional sewage treatments cannot inactivate oocysts (Graczyk et al., 2008). There are many wastewater treatment plants and pumping stations surrounding the waters of Orchard Beach and Clason Point, which could be producing sludge that is able to contaminate the waters (Fig. 5).
Conclusions
An increasing prevalence of G. lamblia in G. demissa collected at Orchard Beach and Clason Point has been observed over the past four years. This is important to public health because people swim at these beaches during the summer, and could potentially ingest G. lamblia cysts and become infected. In the future, we plan to sequence positive samples in order to assess the genotype of G. lamblia, which will provide the information needed to determine if the infection came from human or animal feces. Sequencing will also allow us to compare the assemblages found over the years and determine if the prevalences have changed. We also plan to amplify positive samples with quantitative real-time PCR, in order to measure the quantity of G. lamblia DNA in each sample.

Figure 5. Map of wastewater plants in the surrounding areas of Orchard Beach and Clason Point, Bronx, NY (https://www.google.com/maps/search/sewage+treatment+plant+near+me/@40.8405392,-73.883416,12z)
Acknowledgments This work was financially supported by the Linda and Dennis Fenton ’73 endowed biology research fund. The author would like to thank Dr. Ghislaine Mayer for giving her this research opportunity.
References
Adell, A. D., Smith, W. A., Shapiro, K., Melli, A., and Conrad, P. A. (2014). Molecular Epidemiology of Cryptosporidium spp. and Giardia spp. in Mussels (Mytilus californianus) and California Sea Lions (Zalophus californianus) from Central California. Applied and Environmental Microbiology, 80(24), 7732-7740. http://doi.org/10.1128/AEM.02922-14
Baldursson, S. and Karanis, P. (2011). Waterborne transmission of protozoan parasites: Review of worldwide outbreaks – An update 2004–2010. Water Research, 45(20), 6603-6614. https://doi.org/10.1016/j.watres.2011.10.013
Betancourt, W. Q. and Rose, J. B. (2004). Drinking water treatment processes for removal of Cryptosporidium and Giardia. Veterinary Parasitology, 126(1-2), 219-234. https://doi.org/10.1016/j.vetpar.2004.09.002
Cacciò, S.M., De Giacomo, M., and Pozio, E. (2002). Sequence analysis of the β-giardin polymorphism assay to genotype Giardia duodenalis cysts from human faecal samples. International Journal of Parasitology, 32, 1023-1030. https://doi.org/10.1016/S0020-7519(02)00068-1
Carravieri, A., Bustamante, P., Churlaud, C., and Cherel, Y. (2013). Penguins as bioindicators of mercury contamination in the Southern Ocean: Birds from the Kerguelen Islands as a case study. Science of the Total Environment, 454-455, 141-148. https://doi.org/10.1016/j.scitotenv.2013.02.060
Feng, Y., and Xiao, L. (2011). Zoonotic Potential and Molecular Epidemiology of Giardia Species and Giardiasis. Clinical Microbiology Reviews, 24(1), 110-140. http://doi.org/10.1128/CMR.00033-10
Graczyk, T. K., Kacprzak, M., Neczaj, E., Tamand, L., Graczyk, H., Lucy, F. E., and Girouard, A. S. (2008). Occurrence of Cryptosporidium and Giardia in sewage sludge and solid waste landfill leachate and quantitative comparative analysis of sanitization treatments on pathogen inactivation. Environmental Research, 106(1), 27-33. https://doi.org/10.1016/j.envres.2007.05.005
Heyworth, M. F. (2016). Giardia duodenalis genetic assemblages and hosts. Parasite, 23, 13. http://doi.org/10.1051/parasite/2016013
Koehler, A. V., Jex, A. R., Haydon, S. R., Stevens, M. A., and Gasser, R. B. (2014). Giardia/giardiasis – A perspective on diagnostic and analytical tools. Biotechnology Advances, 32(2), 280-289. https://doi.org/10.1016/j.biotechadv.2013.10.009
Lalle, M., Pozio, E., Capelli, G., Bruschi, F., Crotti, D., Simone, M., and Cacciò, S.M. (2005). Genetic heterogeneity at the β-giardin locus among human and animal isolates of Giardia duodenalis and identification of potentially zoonotic subgenotypes. International Journal of Parasitology, 35, 207-213. doi:10.1016/j.ijpara.2004.10.022
Miller, W. A., Atwill, E. R., Gardner, I. A., Miller, M. A., Fritz, H. M., Hedrick, R. P., Melli, A. C., Barnes, N. M., and Conrad, P. A. (2005). Clams (Corbicula fluminea) as bioindicators of fecal contamination with Cryptosporidium and Giardia spp. in freshwater ecosystems in California. International Journal for Parasitology, 35(6), 673-684. https://doi.org/10.1016/j.ijpara.2005.01.002
O'Connor, T. P. (2002). National distribution of chemical concentrations in mussels and oysters in the USA. Marine Environmental Research, 53(2), 117-143. https://doi.org/10.1016/S0141-1136(01)00116-7
Tei, F. F., Kowalyk, S., Reid, J. A., Presta, M. A., Yesudas, R., and Mayer, D. C. G. (2016). Assessment and Molecular Characterization of Human Intestinal Parasites in Bivalves from Orchard Beach, NY, USA. International Journal of Environmental Research and Public Health, 13(4), 381. http://doi.org/10.3390/ijerph13040381
Bark and spine analyses of Neobuxbaumia mezcalaensis and Pachycereus hollianus

Catherine McDonough∗
Laboratory of Plant Morphogenesis, Department of Biology, Manhattan College

Abstract. The purpose of this study was to investigate bark coverage patterns in two Mexican cactus species and to compare those patterns to the number of spines of each species. Neobuxbaumia mezcalaensis and Pachycereus hollianus are two species of columnar cactus found in the Tehuacan-Cuicatlan Biosphere Reserve, San Juan Raya (18◦ N 97◦ W), Puebla, Mexico, both of which experience bark coverage. In saguaro cacti (Carnegiea gigantea), such bark coverage leads to premature cactus death (Evans et al. 2005). In both cactus species, there is more bark found on south-facing surfaces than on all other surfaces (bark percentages: N. mezcalaensis – 93% S, 88% E, 77% N, and 66% W; P. hollianus – 95% S, 76% E, 55% N, and 39% W). For N. mezcalaensis, young cacti have 10 spines per areole. Cacti taller than 4 m with little to no bark have 7 spines per areole. Once a cactus surface has 25% bark coverage, only 3 spines remain. For P. hollianus, young cacti have 17 spines per areole. Cacti taller than 4 m with little to no bark have 16 spines per areole. Even cacti with high percentages of bark retain more than 14 spines per areole. Differences in spine retention will be subject to further analysis when additional species are analyzed.
Introduction
Cactus plants can be found throughout the Americas, but the greatest diversity of cacti occurs in Mexico (Anderson, 2001). This study took place in Central Mexico (18◦ N latitude) because, among this diversity, there is a large population of tall, long-lived columnar cactus species (Anderson, 2001). In addition to the tall columnar species, there is a large variety of smaller shrub-like cactus plants, such as Opuntia species (Fig. 1), creating a dense thorn forest. The two columnar cacti in this study are Neobuxbaumia mezcalaensis (Fig. 2) and Pachycereus hollianus (Fig. 3), both of which experience bark coverage. These species are endemic to the Tehuacan-Cuicatlan Biosphere Reserve, San Juan Raya (18◦ N 97◦ W), Puebla, Mexico, home to a variety of other columnar cacti as well. P. hollianus is between 4 and 5 m in height and has a diameter of 5 to 7 cm (Anderson 2001). It is usually not branched, but when it is, the branching occurs at the very base of the plant. It is typically used as a hedge plant in Mexico because of its high population density and thick spines. N. mezcalaensis has a much larger diameter (13 to 40 cm) and height (5 to 10 m) than P. hollianus, and its population is more scattered. Cactus plants normally do not form a bark covering, and thus their stems remain green. However, many species of columnar cacti have been shown to produce bark, a process known as epidermal barking (Evans, 2005; Evans and Macri, 2008; Evans et al. 1994a; 1994b; 1994c). Such bark coverage has been shown to lead to the premature death of cactus plants (Evans et al., 2005). Cactus bark is caused by proliferation of epidermal cells, making it different from the bark of most plant species (Gibson and Nobel 1986). Barking begins when epicuticular waxes accumulate and block the stomata. Once stomata are covered, gas exchange is limited, and internal tissues begin to die. ∗
Research mentored by Lance Evans, Ph.D.
The Manhattan Scientist, Series B, Volume 5 (2018)
Figure 1. Photographic image of several cactus species endemic to Tehuacan-Cuicatlan Biosphere Reserve, San Juan Raya (18◦ N 97◦ W), Puebla, Mexico.
Figure 2. A: Image of a cactus surface showing crests and troughs. B: Top-view diagram of a columnar cactus where one crest and two troughs are indicated. Note that crests are convex and troughs are concave, so that crests should receive more incident sunlight than troughs. In other words, crests may provide shade for troughs.
Figure 3. A: Image of several cacti of Neobuxbaumia mezcalaensis at Tehuacan-Cuicatlan Biosphere Reserve. B: Image of a surface of N. mezcalaensis showing crests and troughs.
Previous studies have shown that this cactus barking is due to sunlight exposure (Evans et al., 1994b; Evans and Cooney, 2015). For cactus plants in the Northern hemisphere, the south-facing surfaces, on average, receive higher sunlight exposure than the other surfaces (Geller and Nobel, 1984). Therefore, south-facing surfaces bark faster than north-facing surfaces. The north-facing surface is typically the last to bark before cactus death (Evans et al., 2001, Fig. 4). Additionally, columnar cacti are ribbed, giving them indentations and protrusions. Two flat surfaces,
known as troughs, meet at a point that protrudes out of the cactus, known as the crest (Fig. 5). Crests will thus shade their troughs, protecting them from sunlight exposure.
Figure 4. Diagram of the idea that bark occurs first on south-facing surfaces and progresses to the north-facing surfaces.
Figure 5. A: Image of several cacti of Pachycereus hollianus at Tehuacan-Cuicatlan Biosphere Reserve. B: Image of a surface of P. hollianus showing crests and troughs.
Cactus bark affects not only photosynthetic processes but also spine counts. Previous studies have shown that as crest barking increases, the number of spines found on the crest decreases (Evans and L'Abbate, 2018). Spines come in a variety of colors and shapes. As a cactus plant ages, the color and number of its spines can vary. Anderson (2001) describes three types of spines in each species: apical spines, central spines, and radial spines. N. mezcalaensis ideally has 2 apical, 1 central, and 7 radial spines (Fig. 6A). The spines are off-white with darker tips. In young cactus plants, P. hollianus has 2 apical, 3 central, and 12 radial spines, which are white and swollen at a darker gray base (Fig. 7A). The first hypothesis for this study is that there will be more bark coverage on the south-facing surfaces than on the other surfaces. The second hypothesis is that as bark increases, the number of spines will decrease.
Figure 6. A: Image of a surface of a young (stem less than 2.5 m) Neobuxbaumia mezcalaensis that shows a full complement of spines, a dark color, and no bark. B through E: Four images of one cactus that show differences in bark coverage on surfaces. B: Image shows the south-facing crest and troughs. Note the extensive bark on the crests with bark on troughs. C: Image shows the east-facing crest and troughs. Note the bark on the crests and troughs. D: Image shows the north-facing crest and troughs. Note less bark on the crests and troughs. E: Image shows the west-facing crest and troughs. Note less bark on the crests and troughs.
Figure 7. A: Image of a surface of a young (stem less than 2.5 m) Pachycereus hollianus that shows a full complement of spines, a green color, and little bark. B through E: Four images of one cactus that shows differences in bark coverage on surfaces. B: Image shows the south-facing crest and trough. Note the extensive bark on the crests with bark on troughs. C: Image shows the east-facing crest and troughs. Note the bark on the crests and troughs. D: Image shows the north-facing crest and troughs. Note less bark on the crests and troughs. E: Image shows the west-facing crest and troughs. Note less bark on the crests and troughs.
Materials and Methods
The cacti studied are found in Tehuacan-Cuicatlan Biosphere Reserve, San Juan Raya (18◦ N 97◦ W), Puebla, Mexico. This region is in a large valley that normally receives precipitation only during the summer period. The region has numerous species of columnar cacti and a wide variety of other plants and animals. The Biosphere Reserve is a restricted area controlled by local people
so that the plants are protected and not subject to vandalism. Evaluations of two species of columnar cacti (Neobuxbaumia mezcalaensis and Pachycereus hollianus) occurred in May 2018. These two cactus species are endemic to this region. Data were collected for 105 N. mezcalaensis and 77 P. hollianus cactus plants. All cacti sampled for this study were at least 4 m in height. In addition, cactus plants with surrounding vegetation, including other cactus plants, were avoided: the aim of the study was to view bark accumulation on surfaces, and previous results have shown that bark is related to the amount of sunlight exposure, so shaded plants were excluded from consideration. Photographs and data were collected from each cactus at 1.7 m above ground level for 12 surfaces. Crests closest to south, east, north, and west were selected for data collection and photographs. In addition, the right and left troughs for these crests were evaluated. Bark percentages on these 12 surfaces were estimated visually. Photographs of all surfaces were archived. Additionally, field measurements were taken so that the angle at which each crest protrudes from the inner circumference of the cactus could later be calculated with trigonometric equations. The cactus plants of this study were placed into four classes, I, II, III, and IV, based upon the percentage of bark coverage on the south-facing crests. The classes were: Class I (0-24% bark coverage), Class II (25-49%), Class III (50-74%), and Class IV (75-100%). All four surfaces of one cactus go into one of these classes. Statistical analyses were performed on the bark coverage percentages for the four cactus classes. For each cactus class within each species, mean values of bark coverage were determined for all 12 surfaces.
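The four-class binning described above is a simple threshold rule on the south-crest bark percentage. A minimal sketch (the function name and return labels are illustrative; the study's classification was not necessarily done in code):

```python
def bark_class(south_crest_pct):
    """Assign a cactus to Class I-IV from the bark coverage (%) on its
    south-facing crest, using the thresholds defined in the Methods."""
    if not 0 <= south_crest_pct <= 100:
        raise ValueError("bark coverage must be a percentage in [0, 100]")
    if south_crest_pct < 25:
        return "I"    # 0-24%
    if south_crest_pct < 50:
        return "II"   # 25-49%
    if south_crest_pct < 75:
        return "III"  # 50-74%
    return "IV"       # 75-100%
```

For example, a cactus with 66% bark on its south crest falls in Class III.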
In addition, pairwise t-tests (Snedecor and Cochran, 1967) were performed for each surface to determine statistical differences among surfaces and between classes. Based upon previous descriptions of these cactus species (Anderson, 2001), photographs of south, east, north, and west crests were evaluated for each cactus. Anderson (2001) described central spines, apical spines, and radial spines for both species. The numbers of spines in these three categories were counted for each surface using the photographs taken. The four cactus classes described above were not used to evaluate numbers of spines on surfaces. Instead, the amount of bark coverage on each individual crest was used to determine four new classes, A, B, C, and D, which refer to bark coverage on individual surfaces independent of which cactus they came from. The four classes were: Class A (0-24% bark coverage), Class B (25-49%), Class C (50-74%), and Class D (75-100%). Statistical analyses were performed on the numbers of spines in each group for the four classes above. Mean numbers of individual spine groups and total spines were determined for each crest. In addition, pairwise t-tests (Snedecor and Cochran, 1967) were performed for each surface to determine statistical differences among surfaces and between classes.
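A paired comparison of two surfaces measured on the same cacti might look like the following sketch. The numbers are illustrative placeholders, not the study's raw data; SciPy's `ttest_rel` is one common implementation of the paired t-test:

```python
import numpy as np
from scipy import stats

# Hypothetical bark percentages for eight cacti, one value per surface
# per cactus (illustrative values only, not the study's data).
south_crests = np.array([88.0, 95.0, 90.0, 97.0, 92.0, 99.0, 94.0, 96.0])
north_crests = np.array([70.0, 78.0, 75.0, 80.0, 72.0, 81.0, 77.0, 79.0])

# Paired t-test: surfaces are compared within each cactus, not across cacti.
t_stat, p_value = stats.ttest_rel(south_crests, north_crests)
print(f"t = {t_stat:.2f}, p = {p_value:.2g}")
```

Pairing matters here because the two measurements on one cactus are not independent of each other.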
Results
The first hypothesis for this study was that cactus plants have more bark coverage on the south-facing surfaces than on other surfaces. Data in Table 1 show that for Class I and II cacti of N. mezcalaensis, similar bark coverages were present on most crests. However, Class III and IV cacti had more bark on south-facing crests than on other surfaces. For example, south crests of Class III and IV cacti had 66% and 93% bark coverage, respectively. In contrast, north crests of Class III and IV cacti had 57% and 77% bark coverage, respectively. Moreover, for Class III and IV cacti, bark coverage on troughs was about one-third to one-half that of their respective crests.

Table 1. Relationship between cactus surface bark coverages of Neobuxbaumia mezcalaensis

Class          Number of   South       South Right  East     East Right  North    North Right  West     West Right
(Coverage)     Samples^1   Crests      Troughs      Crests   Troughs     Crests   Troughs      Crests   Troughs
I (0-24%)          38      10          10           10       6           9        7            10       8
II (25-49%)         5      33          8            30       7           60       6            34       8
III (50-74%)        9      66^A,2      25^B         59^A     16^B        57^A     17^B         40       17
IV (75-100%)       49      93^a,3,A    34^B,I,4     88^a,A   26^B,I      77^b,A   21^B,II      66^c,A   20^B,III

^1 The number of cactus plants in each class.
^2 The uppercase superscript "A" refers to the results of a t-test comparing a crest to its own trough; the uppercase superscript on the corresponding trough indicates whether the crest is statistically different from the trough. This applies to all other crest and trough pairs.
^3 The lowercase superscript "a" refers to the results of a t-test comparing a crest to the other crests of the cacti in the class; lowercase superscripts on subsequent crests indicate whether that crest is statistically different from the other crests.
^4 The Roman-numeral superscript "I" refers to the results of a t-test comparing a trough to the other troughs of the cacti in the class; Roman-numeral superscripts on subsequent troughs indicate whether that trough is statistically different from the other troughs.
The second hypothesis was that surfaces with more bark coverage have fewer spines. Young cactus plants (less than 3 m in height) of N. mezcalaensis had 2 apical spines, 1 central spine, and 7 radial spines (Fig. 6). Cacti in all classes had fewer than one apical spine and one central spine on average. Class A cacti had about 6 radial spines, while cacti in Classes B through D had about 3 radial spines (Table 2). When all spines were added, cacti in Classes B through D had about half the spines of those in Class A (Fig. 8), and Classes B through D were statistically different from Class A. Overall, bark coverages in P. hollianus were similar to those for N. mezcalaensis. As with N. mezcalaensis, there were no statistical differences among crests or troughs for Class I and II cacti of P. hollianus (Table 3). However, Class III and IV cacti had more bark on south-facing crests than on other surfaces (Fig. 7). For example, south crests of Class III and IV cacti had 62% and 95% bark coverage, respectively. In contrast, north crests of Class III and IV cacti had 44% and 55% bark coverage, respectively. For Class III and IV cacti, bark coverages on troughs were about 80% of the coverages on their respective crests.
Table 2. Relationship between bark coverage and the presence of spines on surfaces of Neobuxbaumia mezcalaensis

Class          Number of    Apical     Central    Radial    Total
(Coverage)     Samples^1    Spines     Spines     Spines    Spines
A (0-24%)         182       0.60^a,2   0.82^a     5.92^a    7.31^a
B (25-49%)         22       0.27^a     0.50^a     3.09^b    3.86^b
C (50-74%)         48       0.26^a,b   0.22^a,b   3.17^b    3.60^b
D (75-100%)       157       0.17^b     0.18^b     2.94^b    3.27^b

^1 The number of cactus surfaces on which spines were counted and analyzed.
^2 The lowercase superscripts refer to t-tests of spine numbers between classes; each letter indicates whether that class is statistically different from the previous classes.
Figure 8. Relationship between number of spines and bark coverage for Neobuxbaumia mezcalaensis determined by Classes A through D. The equation of the line is y = 13.5x^(−0.32), r² = 0.95. The value for Class A was statistically different from the values of Classes B through D.
Figure 9. Relationship between number of spines and bark coverage for Pachycereus hollianus determined by Classes A through D. The equation of the line is y = 17.2x^(−0.04), r² = 0.84. The values for Classes A through C were statistically different from the value of Class D.
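Power-law fits like those in Figs. 8 and 9 can be obtained by linear regression in log-log space. A sketch using the class means from Table 2 at assumed class-midpoint coverages; it will not reproduce the published coefficients exactly, since those were presumably fit to the underlying per-surface data:

```python
import numpy as np

# Midpoints of the four coverage classes (an assumption) and the mean
# total spine counts for N. mezcalaensis taken from Table 2.
coverage = np.array([12.5, 37.5, 62.5, 87.5])   # % bark coverage
spines = np.array([7.31, 3.86, 3.60, 3.27])     # mean total spines

# y = c * x**k is linear after taking logs: log y = log c + k * log x.
k, log_c = np.polyfit(np.log(coverage), np.log(spines), 1)
c = np.exp(log_c)
print(f"fitted: y = {c:.1f} * x^({k:.2f})")
```

The negative exponent reflects the decline in spine number with increasing bark coverage.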
Table 3. Relationship between cactus surface bark coverages of Pachycereus hollianus

Class          Number of   South        South Right  East     East Right  North    North Right  West    West Right
(Coverage)     Samples^1   Crests       Troughs      Crests   Troughs     Crests   Troughs      Crests  Troughs
I (0-24%)         32       8            15           20       17          15       10           9       6
II (25-49%)        3       33           35           55       48          25       15           15      13
III (50-74%)       9       62^a,2,A,3   37^B,I,4     53^a,b   53^I        44^a,b   28^I,II      27^b    18^II
IV (75-100%)      30       95^a,A       82^B,I       76^b     64^II       55^c     43^III       39^c    34^III

^1 The number of cactus plants in each class.
^2 The uppercase superscript "A" refers to the results of a t-test comparing a crest to its own trough; the uppercase superscript on the corresponding trough indicates whether the crest is statistically different from the trough. This applies to all other crest and trough pairs.
^3 The lowercase superscript "a" refers to the results of a t-test comparing a crest to the other crests of the cacti in the class; lowercase superscripts on subsequent crests indicate whether that crest is statistically different from the other crests.
^4 The Roman-numeral superscript "I" refers to the results of a t-test comparing a trough to the other troughs of the cacti in the class; Roman-numeral superscripts on subsequent troughs indicate whether that trough is statistically different from the other troughs.
Young cacti (less than 3 m in height) of P. hollianus had 2 apical spines, 3 central spines, and 12 radial spines (Fig. 7). Cacti in all classes had fewer than two apical spines and three central spines on average. Class A cacti had about 11 radial spines, while cacti in Classes B through D had about 10 radial spines (Table 4). Classes A through C were found to be statistically different from Class D (Fig. 9). Overall, cacti of P. hollianus lost few spines even at high bark coverages.

Table 4. Relationship between bark coverage and the presence of spines on surfaces of Pachycereus hollianus

Class          Number of    Apical    Central    Radial    Total
(Coverage)     Samples^1    Spines    Spines     Spines    Spines
A (0-24%)         158       1.89      2.92       11.1      15.9^a,2
B (25-49%)         20       1.82      2.91       9.95      14.8^a
C (50-74%)         28       1.77      2.80       10.2      15.0^a
D (75-100%)        98       1.82      2.90       9.49      14.2^b

^1 The number of cactus surfaces on which spines were counted and analyzed.
^2 The lowercase superscripts refer to t-tests of spine numbers between classes; each letter indicates whether that class is statistically different from the previous classes.
Discussion
The purpose of this study was to investigate bark coverage patterns in two cactus species from Mexico. When a cactus has bark coverage, gas exchange ceases, preventing photosynthetic and respiratory processes. Eventually, internal tissues decay, and the cells die. Long-lived columnar cacti can be found throughout the Americas, and twenty-one species of these cacti are experiencing epidermal browning (Anderson, 2001; Evans, 2005; Evans and Macri, 2008; Evans et al., 1994a; 1994b; 1994c). These 21 species are more vulnerable to premature death (Evans et al., 2005). Self-shading is a measure of the ability of the crests to shade trough surfaces. Estimates of self-shading were determined by dividing the depth of the troughs by the distance between crests (Fig. 10). The crest-to-crest distance for N. mezcalaensis was 60 mm while the trough depths
Figure 10. Diagram for the calculation of self-shading of troughs by crests. "C" denotes a crest and "T" a trough. "a" denotes the crest-to-crest distance, while "b" is the trough depth. Self-shading was calculated as b/a.
were 40 mm, so the self-shading ratio was calculated to be 66%. The crest-to-crest distance for P. hollianus was 36 mm while the trough depths were 9 mm, so the self-shading ratio was calculated to be 25%. Clearly, the ratio of 66% for N. mezcalaensis would produce more self-shading than the 25% value for P. hollianus. The ratios of bark coverage on troughs to that on crests for the four surfaces of Class IV cacti of N. mezcalaensis were 37% (S), 30% (E), 27% (N), and 30% (W), to provide a
mean of 31% (Table 1). The ratios of bark coverage on troughs to that on crests for the four surfaces of Class IV cacti of P. hollianus were 86% (S), 84% (E), 78% (N), and 87% (W), to provide a mean of 84% (Table 3). Therefore, the low trough-to-crest bark ratio of 31% for N. mezcalaensis is coincident with the high self-shading ratio of 66%. In contrast, the high trough-to-crest bark ratio of 84% for P. hollianus is coincident with the low self-shading ratio of 25%. P. hollianus is a much flatter cactus, meaning its crests do not protrude very far from the inner circumference; therefore, the bark coverages of crests and troughs are much closer. Data from a previous study were used to calculate trough-to-crest ratios for saguaro cacti (Carnegiea gigantea) (DeBonis et al., 2017). In that study, there were three sampling periods (1994, 2002, and 2010), in which trough-to-crest ratios for the four surfaces were 61 (S), 38 (E), 48 (N), and 72 (W). When these four ratios were compared with those of N. mezcalaensis [37 (S), 30 (E), 27 (N), and 30 (W)] with paired t-tests, all comparisons were statistically significant (p < 0.01). In Tucson Mountain Park, where these plants are found, the latitude is 32◦, meaning south-facing surfaces receive four times as much sunlight as north-facing surfaces (Geller 1986). Therefore, stark differences are seen between the north- and south-facing surfaces, giving the north-facing surfaces a much lower trough-to-crest bark ratio.

Spines are a structure exclusive to cacti. Spines are simply modified leaves, hardened by lignin (Gibson and Nobel, 1986). They are costly in energy, energy that could otherwise be put toward photosynthetic or reproductive processes. The most obvious function of spines is defense against herbivores, but evidence also suggests that spines may regulate temperature, limit transpiration, and protect from ultraviolet radiation (Gibson and Nobel, 1986).
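The self-shading estimate discussed above is just the ratio of trough depth to crest-to-crest distance. A minimal sketch using the two species' reported dimensions (the function name is illustrative):

```python
def self_shading(crest_to_crest_mm, trough_depth_mm):
    """Self-shading estimate: trough depth divided by crest-to-crest
    distance (Fig. 10); higher values mean crests shade troughs more."""
    return trough_depth_mm / crest_to_crest_mm

# Dimensions reported in the Discussion.
ratio_nm = self_shading(60, 40)  # N. mezcalaensis: ~0.67, reported as 66%
ratio_ph = self_shading(36, 9)   # P. hollianus: 0.25 (25%)
print(f"N. mezcalaensis: {ratio_nm:.1%}, P. hollianus: {ratio_ph:.1%}")
```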
If a cactus loses its spines, it becomes vulnerable to predators and even to the environment. As the data show, the number of spines of N. mezcalaensis does decrease as crest bark coverage increases. Once a cactus reaches 25% bark on the crest, nearly half of all spines are lost. This is not seen in P. hollianus, however: only one to two spines are lost between Class A and Class D, showing that bark coverage does not seriously affect the number of spines in P. hollianus. A previous study did a similar analysis of spines for saguaro cactus plants (Carnegiea gigantea), comparing north and south bark coverages to the number of spines (Evans and L'Abbate, 2018). Only living cactus plants with paired photographs were analyzed, leaving a data set of 461 cactus plants. Means for number of spines and spine quality were taken for the north- and south-facing crests. The mean bark coverage for the north and south crests was 19.2% and 62.0%, respectively. The mean total spines for north-facing surfaces was 7.75 while that of south-facing surfaces was 4.21. Despite growing about 1500 miles apart, N. mezcalaensis and C. gigantea show similarities in their spines. Future research will investigate the other species in Mexico mentioned previously. Data and photographs were also taken from Neobuxbaumia macrocephala, Neobuxbaumia tetetzo, and Cephalocereus columna-trajani in May 2018 in the Tehuacan-Cuicatlan Biosphere Reserve. Bark
data and spines will be analyzed extensively to investigate the effect of sun exposure on the species. Additionally, the spine and bark data collected from all species studied in Mexico will be compared to that of saguaro cacti (Carnegiea gigantea).
Acknowledgements
The author is indebted to the Catherine and Robert Fenton Endowed Chair in Biology, held by Lance S. Evans, for financial support of this research.
References
Anderson, E. F. 2001. The Cactus Family. Timber Press. Portland, OR.
De Bonis, M., L. Barton, and L. S. Evans. 2017. Rates of bark formation on surfaces of saguaro cacti (Carnegiea gigantea). J. Torrey Bot. Soc. 144: 450-458.
Evans, L. S., V. A. Cantarella, K. W. Stolte, and K. H. Thompson. 1994a. Phenological changes associated with epidermal browning of saguaro cacti at Saguaro National Monument. Environ. Exp. Bot. 34: 9-17.
Evans, L. S., V. A. Cantarella, L. Kaszczak, S. M. Krempasky, and K. H. Thompson. 1994b. Epidermal browning of saguaro cacti (Carnegiea gigantea): Physiological effects, rates of browning and relation to sun/shade conditions. Environ. Exp. Bot. 34: 107-115.
Evans, L. S., and M. L. Cooney. 2015. Sunlight-induced bark formation in long-lived South American columnar cacti. Flora 217: 33-40. doi:10.1016/j.flora.2015.09.012
Evans, L. S. and R. L'Abbate. 2018. Areole changes during bark formation on saguaro cacti. Haseltonia. 24: 55-62.
Evans, L. S. and A. Macri. 2008. Stem surface injuries of several species of columnar cacti of Ecuador. J. Torrey Bot. Soc. 135: 475-482.
Evans, L. S., C. McKenna, R. Ginocchino, G. Montenegro, and R. Kiesling. 1994c. Surficial injuries to several cacti of South America. Environ. Exp. Bot. 35: 105-117.
Evans, L. S., J. H. Sullivan, and M. Lim. 2001. Initial effects of UV-B radiation on stem surfaces of Stenocereus thurberi (organ pipe cacti). Environ. Exp. Bot. 46: 181-187.
Evans, L. S. 2005. Stem surface injuries to Neobuxbaumia tetetzo and Neobuxbaumia mezcalaensis of the Tehuacan Valley of Central Mexico. J. Torrey Bot. Soc. 132: 33-37.
Evans, L. S., A. J. Young, and J. Harnett. 2005. Changes in scale and bark stem surface injuries and mortality rates of a saguaro (Carnegiea gigantea) cacti population in Tucson Mountain Park. Can. J. Bot. 83: 311-319.
Geller, G. and P. Nobel. 1986. Cactus ribs: influence of PAR interception on CO2 uptake.
Photosynthetica. 18: 482-494.
Gibson, A. C. and P. S. Nobel. 1986. The Cactus Primer. Harvard Univ. Press. Cambridge, MA.
Snedecor, G. W. and W. G. Cochran. 1967. Statistical Methods, Sixth Edition. The Iowa State University Press. Ames.
Predicting bark coverage on saguaro cacti (Carnegiea gigantea)

Olivia Printy∗
Laboratory of Plant Morphogenesis, Department of Biology, Manhattan College

Abstract. More than 20 species of tall, columnar cacti in the Americas show bark coverage on their surfaces. Saguaro cacti (Carnegiea gigantea) have shown mortality rates of about 2.3% per year from 1980 to the present. This mortality is due to bark coverage on cactus surfaces. A saguaro population has been studied from 1994 to 2017. When bark coverage on north-facing right troughs of saguaro cacti reaches 80%, there is a 95% probability that these cacti will die within 8 years. The purpose of this study was to use bark coverage on several predictor surfaces to predict bark coverage on north-right troughs of saguaro cacti (Carnegiea gigantea) and to determine whether bark coverages on north-right troughs will predict eventual cactus death. Data from 12 surfaces on 1149 cacti and four sampling periods were used, giving over 55,000 data points. For this analysis, the machine learning programs DEC Trees and Validate Model were used to evaluate the accuracy of using three surface comparisons to predict bark coverage on the north-right troughs. The three sets of surfaces used were the north-left trough and the west-left trough, the north-left trough and the east-right trough, and the west-right trough and the south-right trough. The first step used Validate Model to separate cacti based upon the rate of bark coverage on north-right troughs; cactus coverages on north-left troughs and west-left troughs were used as predictor surfaces. The input of the data from Validate Model into DEC Trees predicted normal, slow, and fast bark coverages with an accuracy of 97.1%. DEC Trees gives an accuracy of 97% or greater for all predictions using the north-right trough and west-left trough as predictive surfaces.
Overall, the data show that cacti with low rates of bark coverage were deemed slow based on the rate of bark coverage on north-right troughs only. Moreover, the data show that cacti with high rates of bark coverage were deemed fast based on the rate of bark coverage on north-right troughs only. The other 11 surfaces were of much less significance in determining normal, slow and fast cacti.
Introduction
Saguaro cacti (Carnegiea gigantea) are a cactus species found in southern Arizona and northern Mexico (Anderson, 2001). They are tall, reaching up to 10 m in height, and have long life spans that can reach 300 years (Steenbergh and Lowe, 1977). The cacti are exposed to high levels of sunlight. This exposure results in bark formation, or epidermal browning. Epicuticular waxes accumulate over the stomata of the cactus, limiting gas exchange (Evans et al., 2001). This causes the surface of the cactus to turn brown and bark. The reduction of gas exchange limits the ability of the cactus to carry out photosynthesis and respiration and thus limits its growth. The inability to carry out gas exchange will eventually lead to cactus death (Evans et al., 2005, 2013; Evans and DeBonis, 2015; DeBonis et al., 2017). South-facing surfaces are the first to experience epidermal browning. This process continues along the western and eastern surfaces of the cactus over time until it reaches the north-facing surfaces, which are the last to form bark. Once a cactus reaches more than 80% barking on the north-right trough, it will be dead within 8 years (Evans et al., 2013). The purpose of this study is to identify rates of bark coverage on a variety of surfaces that will predict bark coverage on north-right troughs, which will in turn predict cactus death. For this ∗
Research mentored by Lance Evans, Ph.D.
analysis, machine learning programs will be used to assess how well given surfaces predict, as well as how they determine, rates of bark coverage on north-right troughs. Specifically, the data will be partitioned into rates of coverage that are normal (average), slow, and fast.
Figure 1. The stages of bark formation on saguaro cacti. A cactus with no barking is shown with the trough and crest labelled. Crests protrude from the cactus and are generally exposed to more sunlight than troughs, which are angled. A cactus with more extensive bark coverage is also shown, as is a cactus that is completely covered in bark. Note that on this cactus, there are no spines located on the crest of the cactus.
Materials and Methods
Field conditions
Saguaro cacti (Carnegiea gigantea) were analyzed over a 23-year time period. Fifty permanent plots containing 1149 cacti were established in Tucson Mountain Park (Fig. 2) in 1994. Evaluations of the selected cacti occurred in 1994, 2002, 2010, and 2017. Each cactus was designated with a plot and cactus number, used as its reference in the database resulting from the evaluations.
Figure 2. A population of saguaro cacti found in Tucson Mountain Park in Tucson, Arizona.
Data sets generated
For each of the four sampling periods (1994, 2002, 2010, 2017), samples of the ribs of saguaro cacti were evaluated for bark coverage. The cacti have a total of twelve ribs; each rib has a single crest protruding from the cactus and two troughs, or indentations, one on each side of the crest. Ribs facing each of the cardinal directions (south, west, east, north) most closely were evaluated. Both the crest and the troughs of each rib were evaluated. Samples were taken 1.75 m from the ground and were 8 cm long. Percent green was initially calculated and then converted into percent bark for each surface for further analysis. Percent bark coverage for the evaluated crests and troughs was then assigned plot and cactus numbers and uploaded into a Microsoft Excel file for each of the four sampling periods. This Master File contains over 55,000 data points.
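The green-to-bark conversion mentioned above is a simple complement, assuming green and barked area together account for each surface, which the Methods imply. An illustrative sketch (the function name is hypothetical):

```python
def percent_bark(percent_green):
    """Convert a surface's visually estimated percent green to percent
    bark, treating the two categories as complementary."""
    if not 0.0 <= percent_green <= 100.0:
        raise ValueError("percent green must lie in [0, 100]")
    return 100.0 - percent_green
```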
Figure 3. Visual representation of the barking process on saguaro cacti. Barking will occur on the southern surfaces, specifically the southern crest, first and then travel around the eastern and western surfaces. The barking will then extend to the northern surfaces, which will be the last surfaces to experience bark coverage.
Structure of analysis

The analysis of this study is outlined in Fig. 4. Data from the Master File was first placed into the MATLAB program Validate Model, which was used to classify the rate of bark coverage on cactus surfaces as fast, normal, or slow. Fast and slow cacti were characterized as having values more than two standard deviations from normal. Three sets of cactus surfaces were used as pairs. The paired surfaces were (A) west-left troughs and north-left troughs, (B) north-left troughs and east-right troughs, and (C) south-right troughs and west-left troughs. In all three cases, the paired predictors were used to predict bark coverages on north-right troughs. Once the pairs were run through Validate Model, fast, normal, and slow cacti were identified. The next step was to compare the cacti of these three groups for Pair A and Pair B. The purpose of these comparisons was to separate the cacti that were uniquely fast for A from those uniquely fast for B. Overall, comparisons were made between fast and slow: A vs. B, A vs. C, and B vs. C. This gave six unique populations of cacti. This process was done for each of the four sampling times (1994, 2002, 2010, and 2017), for a total of 24 unique populations. Each population was assigned a rate (fast or slow), a pairing (e.g., A vs. B), and a year. As stated previously, the north-left and west-left troughs were used to predict bark coverage rates on north-right troughs, placing the cacti in three groups: slow, normal, and fast. After the fast, normal, and slow populations were selected, decision trees were produced, which included north-right troughs with the other eleven surfaces, to separate normal cacti from fast cacti and normal cacti from slow cacti.
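The two-standard-deviation partition described above can be sketched in Python. This is an illustration under our own naming, not the actual Validate Model code; it assumes the cutoffs are applied to the residuals between observed and predicted coverage.

```python
from statistics import mean, stdev

def partition_by_rate(observed, predicted):
    """Split cacti into slow/normal/fast by residual vs. 2 standard deviations.

    observed, predicted: dicts mapping cactus id -> percent bark on the
    north-right trough (predicted from the paired surfaces). All names are
    illustrative, not the actual Validate Model interface.
    """
    residuals = {c: observed[c] - predicted[c] for c in observed}
    mu = mean(residuals.values())
    sd = stdev(residuals.values())
    groups = {"slow": [], "normal": [], "fast": []}
    for c, r in residuals.items():
        if r > mu + 2 * sd:
            groups["fast"].append(c)    # barking faster than predicted
        elif r < mu - 2 * sd:
            groups["slow"].append(c)    # barking slower than predicted
        else:
            groups["normal"].append(c)
    return groups
```

With heavy class imbalance (most cacti near the predicted value), only a handful of cacti fall outside the two-standard-deviation band, consistent with the small outlier counts in Tables 3 and 4.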
Figure 4. A flow chart depicting the process of obtaining data, specifically for Pair A. Data from an Excel spreadsheet containing over 55,000 data points from four different time periods is run through the MATLAB programs Validate Model, REM High Low, and Leave One Out. This produces an Excel spreadsheet listing high and low outliers, as well as cacti barking at rates considered 'normal', for selected surfaces. This data is then organized by year and analyzed by the DEC Trees program to produce a decision tree and confusion matrix. Additionally, the data produced by the MATLAB programs is analyzed to compare means and significant differences.
Each of the above populations was placed through a decision tree, producing a confusion matrix. In addition, T-test comparisons were run between the fast and the normal, as well as the slow and the normal, for each population.

Computer programs

Predicting bark coverage
1. The data points used were taken from the Excel Master File.
2. The data from the Master File was then run through Validate Model. The algorithm prepares a distribution of the data supplied by the Master Excel file and uses selected predictive surfaces to predict the rates of barking on a selected surface. It provides a histogram (Fig. 5) displaying the range of cacti within the selection and the standard deviation of the selection. The algorithm identifies both "slow" outliers, cacti whose rates of bark coverage are two standard deviations below the mean, and "fast" outliers, cacti whose rates of bark coverage are two standard deviations above the mean.
3. The information provided by Validate Model is then analyzed by Leave One Out. This algorithm compares the observed barking percentage with the predicted barking percentage; the comparison provides a value of error.
Figure 5. The histogram produced by the MATLAB program Validate Model. It shows the distribution of samples collected over all four time periods. The slow outliers, any data points two standard deviations below the mean, are located on the left portion of the histogram, while the fast outliers, any data points two standard deviations above the mean, can be found on the right portion of the histogram.
4. The data is then analyzed by REM High Low. This algorithm analyzes the information produced by Leave One Out and the data provided by the Master Excel file, and produces multiple data sets: one containing the information for fast outliers, one for slow outliers, and one for cacti considered to have a normal rate of barking. Data for all four sampling periods (1994, 2002, 2010, and 2017) are provided for each cactus listed as fast, slow, or normal.

Predicting accuracy
1. The process of predicting barking accuracy begins with the data set generated by the REM High Low program.
2. The data for fast and slow outliers are compared to those of the normal cacti by the DEC Trees program. The program analyzes the fast, slow, and normal cacti for each sampling period (1994, 2002, 2010, 2017) to produce a confusion matrix and decision tree for each comparison. These products give the accuracy of predicting barking on the north-right trough.

Mean comparison

The Master File was used to compare the means generated for bark coverage over time. A copy of the Master File was edited to contain only the fast and slow outliers. The twelve surfaces of each outlier cactus were analyzed by the year they were evaluated (1994, 2002, 2010, 2017). The average bark coverage percentage was calculated separately for all fast and slow outliers per year. The averages were also calculated for normal cacti. T-tests were run on these values to determine whether the fast and slow outlier data were significantly different from the normal cactus data.
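The T-test comparisons above can be illustrated with a short sketch. The paper does not specify the t-test variant; Welch's unequal-variance form is assumed here, and the function name is ours.

```python
from math import sqrt
from statistics import mean, variance

def welch_t(a, b):
    """Welch's t statistic for two independent samples (e.g., bark
    coverages of fast outliers vs. normal cacti for one surface).

    variance() is the sample (n-1) variance, as Welch's test requires.
    """
    return (mean(a) - mean(b)) / sqrt(variance(a) / len(a) + variance(b) / len(b))
```

The statistic is positive when the first group barks more than the second; in practice it would be compared against the t distribution (with Welch-Satterthwaite degrees of freedom) at p < 0.05, the threshold used in Table 1.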
Results

The purpose of this study is to use bark coverages on several predictor surfaces to predict bark coverage on north-right troughs. For this analysis, machine learning programs were used to predict how well surfaces predict, as well as how they determine, rates of bark coverage on north-right troughs. Specifically, the data were partitioned into rates of coverage that are normal, slow, and fast.
The first step in this research was to determine which cacti qualified as fast, slow, or normal. The Validate Model program was used to separate cacti based upon the rate of bark coverage on north-right troughs. For this analysis, cactus coverages on north-left troughs and west-left troughs were used as predictor surfaces. The output from Validate Model is shown in Fig. 5, with the regions of fast outliers and slow outliers indicated. The mean values of all surfaces and T-test results from this analysis are shown in Table 1.

Table 1. Comparison of mean values of high and low Pair A (north-left trough and west-left trough) outliers and normal mean values per year.

1994                  Slow    Normal   Fast
South Crest           60.3     32.6    65.1
South Right Trough    31.6     14.3    40.1
South Left Trough     42.8     14.1    29.4
East Crest            53.5     26.3    56.3
East Right Trough     20.9     10.7    28.8
East Left Trough      35.0     15.3    35.6
North Crest           31.0     16.2    75.0*
North Right Trough     5.0      5.2    50.0*
North Left Trough     15.0      7.3    54.6*
West Crest            71.3*    21.7    43.3
West Right Trough     52.1*    12.6    29.4
West Left Trough      43.3      7.0     9.5

2002                  Slow    Normal   Fast
South Crest           71.3     52.1    74.4
South Right Trough    52.5     29.0    48.8
South Left Trough     63.1     29.6    38.1
East Crest            70.0*    39.7    66.3
East Right Trough     43.1     19.5    37.3
East Left Trough      46.3     28.1    43.5
North Crest           53.1     24.4    81.9*
North Right Trough    10.6      9.7    53.1*
North Left Trough     20.6     11.9    60.6*
West Crest            82.5*    34.7    54.4
West Right Trough     67.5*    23.8    41.9
West Left Trough      56.3*    13.5    19.4

2010                  Slow    Normal   Fast
South Crest           76.3     97.9    95.0
South Right Trough    57.1     72.1    78.4
South Left Trough     65.4     77.1    78.1
East Crest            77.5     95.7    90.0
East Right Trough     46.5     65.0    51.8
East Left Trough      62.6     84.3    65.3
North Crest           68.1     65.0    89.4*
North Right Trough    19.8     35.6    71.3*
North Left Trough     42.8*    52.9    78.8*
West Crest            92.5*    70.7    78.1
West Right Trough     86.9*    59.3    55.0
West Left Trough      80.4*    41.4    21.9

2017                  Slow    Normal   Fast
South Crest           85.0     99.3    97.5
South Right Trough    64.4     77.9    86.3
South Left Trough     73.8     93.6    81.6
East Crest            86.9     97.9    95.6
East Right Trough     63.4     77.9    61.3
East Left Trough      75.6     95.0    74.4
North Crest           72.5     91.4    93.1
North Right Trough    36.3     37.7    82.5*
North Left Trough     58.8*    77.9    85.0*
West Crest            98.5*    88.6    88.8
West Right Trough     98.8*    74.3    60.6
West Left Trough      96.3*    50.0    33.8

*Values with an asterisk are significantly different from normal values in the same year with a probability less than 0.05.

The north-right and north-left troughs for all fast outliers have significantly higher bark coverage percentages than the normal cacti for all of the given time periods. The north crest displayed a similar pattern for the first three sampling periods. For slow outliers, the west crest and west-right trough have significantly higher coverage than normal for all four time periods, and the west-left trough is significantly different for the last three time periods. For the normal cacti, the crests increase at significantly faster rates: the south crest increases from 32% to 99%, the east crest from 26% to 98%, the north crest from 16% to 91%, and the west crest from 22% to 88%.

Table 2 shows the prediction of bark coverage using the results of the A vs. B data from Validate Model in 1994, 2002, 2010, and 2017. The data was run through DEC Trees in order to determine the accuracy of the predictive surfaces. For fast outliers, the accuracy is 97.9% in 1994, 96.9% in 2002, and 98.1% in 2010; by 2017, the accuracy is 98.5%. The accuracy for slow outliers in 1994 is 97.9%; for 2002, 97.3%; for 2010, 97.1%. The accuracy for
Table 2. Comparisons of the predictor surfaces and the accuracies provided by DEC Trees for fast and slow Pair A (north-left trough and west-left trough) outliers for each time period. Number of cacti: 478.

Time  Type  Predictor Surfaces                                                      Accuracy (%)
1994  Fast  North Right <40%, East Crest <99%, East Right ≥28%, East Crest <33%         97.9
2002  Fast  North Right <88.3%, North Right <17.5%, South Left <55%,                    96.9
            North Crest <45%
2010  Fast  North Left ≥45%, West Left <37.5%                                           98.1
2017  Fast  North Right <73%, North Right ≥48%, East Right <23%,                        98.5
            North Right ≥73%, West Left <45%
1994  Slow  West Crest ≥99.5%, South Crest <1%; West Crest ≥99.5%,                      97.9
            South Left ≥55%, East Left <13%
2002  Slow  West Left <73.5%, North Left <15%, North Right <22.5%                       97.3
2010  Slow  West Left <77.5%, North Left <39%, South Left <45%;                         97.1
            West Left ≥77.5%, South Crest <7.5%; West Left ≥77.5%,
            South Crest ≥7.5%, North Right <32.5%, North Right ≥27.5%;
            West Left ≥77.5%, South Crest ≥7.5%, North Right <32.5%,
            North Right <27.5%, North Left ≥55%
2017  Slow  West Left ≥87.5%, South Left <7.5%; West Left ≥87.5%,                       97.3
            South Left ≥7.5%, West Left ≥92.5%, North Left <80%,
            North Left ≥65%
slow outliers in 2017 is 97.3%. DEC Trees gives an accuracy of 97% or greater for all predictions using the north-left trough and west-left trough as predictive surfaces. For all fast populations, a north-facing trough was the initial predicting surface, and all had a predicting accuracy above 96.9%. For all slow populations, the west crest or a west-facing trough was the initial predicting surface, and all had a predicting accuracy of 97.1% or above. DEC Trees also produces a confusion matrix for each decision tree. Confusion matrices detail the accuracy of the predictions by comparing the predicted and the actual classifications. Table 3 shows the results of the confusion matrices for fast outliers for 1994, 2002, 2010, and 2017. Table 4 shows the results of the confusion matrices for slow outliers for 1994, 2002, 2010, and 2017.
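The accuracies in Tables 3 and 4 are simply the proportion of correctly classified cacti in each confusion matrix. A sketch (our naming; cell keys are (actual, predicted) counts, assuming the table rows are actual classes):

```python
def accuracy(cm):
    """Percent of cacti whose predicted class matches their actual class."""
    correct = sum(n for (actual, predicted), n in cm.items() if actual == predicted)
    return 100.0 * correct / sum(cm.values())

# 1994 fast-vs-normal counts as read from Table 3A
cm_1994 = {("fast", "fast"): 3, ("fast", "normal"): 5,
           ("normal", "fast"): 5, ("normal", "normal"): 465}
```

Here accuracy(cm_1994) rounds to 97.9, matching Table 3A; note that because normal cacti vastly outnumber outliers, the accuracy is dominated by the correctly classified normal class.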
Discussion

The purpose of this study was to use bark coverages on several predictor surfaces to predict bark coverage on north-right troughs. For this analysis, machine learning programs, specifically Validate Model and DEC Trees, were used to predict rates of bark coverage on north-right troughs and to determine how well surfaces predict them. Bark coverage on specific cactus surfaces was predicted by Validate Model, while DEC Trees predicted the accuracy with which a predictive surface could be identified.

More than 20 tall, long-lived cactus species in the Americas have been documented as having bark coverage. For these cacti, bark coverage begins on the equatorial surfaces and then travels towards the polar surfaces. For cacti located in the northern hemisphere,
Table 3. Confusion matrices of predicted and actual numbers of cacti with fast vs. normal rates of bark coverage on the north-right trough, based on fast Pair A data for each sampling period. Rows give actual classes; columns give predicted classes.

A. 1994 Normal-Fast (accuracy 97.9%)
                 Predicted Fast   Predicted Normal
Actual Fast            3                 5
Actual Normal          5               465

B. 2002 Normal-Fast (accuracy 96.9%)
                 Predicted Fast   Predicted Normal
Actual Fast            3                 5
Actual Normal          7               462

C. 2010 Normal-Fast (accuracy 98.1%)
                 Predicted Fast   Predicted Normal
Actual Fast            1                 8
Actual Normal          5               466

D. 2017 Normal-Fast (accuracy 98.5%)
                 Predicted Fast   Predicted Normal
Actual Fast            5                 3
Actual Normal          3               467
such as saguaro cacti (Carnegiea gigantea), the southern surfaces will begin to bark first. Sunlight hits the south-facing surfaces four times more than the north-facing surfaces on cacti located at 32° latitude, where the cacti used for this study are located. Additionally, barking will occur first on crests rather than troughs. Once all of the crests on a cactus have significant bark coverage, barking will begin on the troughs. The barking begins on west-facing left troughs and east-facing right troughs before spreading to the remaining troughs. Following this barking pattern, the north-right trough is the final surface to experience significant bark coverage before cactus death. Once a cactus reaches more than 80% barking on the north-right trough, it will be dead within 8 years.

Table 4. Confusion matrices of predicted and actual numbers of cacti with slow vs. normal rates of bark coverage on the north-right trough, based on slow Pair A data for each sampling period. Rows give actual classes; columns give predicted classes.

A. 1994 Normal-Slow (accuracy 97.9%)
                 Predicted Normal   Predicted Slow
Actual Normal         467                 7
Actual Slow             3                 1

B. 2002 Normal-Slow (accuracy 97.3%)
                 Predicted Normal   Predicted Slow
Actual Normal         465                 8
Actual Slow             5                 0

C. 2010 Normal-Slow (accuracy 97.1%)
                 Predicted Normal   Predicted Slow
Actual Normal         463                 7
Actual Slow             7                 1

D. 2017 Normal-Slow (accuracy 97.3%)
                 Predicted Normal   Predicted Slow
Actual Normal         464                 7
Actual Slow             6                 1
Figure 6. A decision tree produced by the analysis of Pair A 1994 fast outliers by MATLAB. A sample will be considered fast if it has less than 40% bark on the north-right trough, less than 99% bark on the east crest, more than or equal to 28% bark on the east right trough, and less than 33% bark on the east crest.
Figure 7. A decision tree produced by the analysis of Pair A 1994 slow outliers by MATLAB. A sample will be considered slow if it has more than or equal to 99.5% bark on the west crest and less than 1% bark on the south crest. It will also be considered slow if it has more than 99.5% bark on the west crest, more than or equal to 55% bark on the south left trough and less than 13% bark on the east left trough.
Three sets of paired surfaces were used to determine the predictability of the north-right trough. Through the use of DEC Trees, it was determined that Pair A (north-left trough and west-left trough) had the best predictability. The north-left trough serves as a good predictor of the north-right trough because of its close proximity to the north-right trough. The means of bark coverage on the north-left trough were the most similar to those of the north-right trough for every sampling period. The west-left trough was paired with the north-left trough because it is one of the first trough surfaces to experience bark coverage. However, the means for the west-left troughs have a wide range in coverage. The similarities of the means for the north-left trough and west-left trough indicate a relationship with the north-right trough. Using the data provided by Validate Model, DEC Trees produced high accuracies for all four sampling periods, which average to 97.9% for fast outliers and 97.4% for slow outliers. The accuracies for the four sampling periods were close in value, as shown in Table 5. The two additional pairs of surfaces were used in comparison with Pair A in an attempt to see a range of predictabilities. Pair B (north-left trough and east-right trough) also gave similar, but slightly lower, accuracies, as shown in Table 5. This was expected, as the east-right trough is located at the same position as the west-left trough on the opposite side of the cactus. The predictability of Pair C (west-right trough and south-right trough) was not as high as that of Pair A and Pair B. The two surfaces are located the furthest away from the north-right trough of the six predictive surfaces used. The average
accuracies are listed in Table 5.

Table 5. Accuracies produced by DEC Trees for Pair A (north-left trough and west-left trough), Pair B (north-left trough and east-right trough), and Pair C (west-right trough and south-right trough) for all four sampling periods.

Pair                                                Outlier Type   Range of Accuracies   Total Average Accuracy
Pair A (north-left trough and west-left trough)     Fast           96.9 - 98.5%          97.6%
                                                    Slow           97.1 - 97.9%
Pair B (north-left trough and east-right trough)    Fast           91.3 - 97.5%          96.5%
                                                    Slow           97.2 - 98.3%
Pair C (west-right trough and south-right trough)   Fast           88.4 - 90.1%          88.9%
                                                    Slow           88.0 - 89.2%
Additionally, when the north-left trough and west-left trough were used as the predictive surfaces, the west-left trough followed two trends when means were compared. Over all four time periods for slow outliers, the west-left trough mean values were significantly higher than the mean values for the north-right and north-left troughs. Over all four time periods for fast outliers, the west-left trough mean values were significantly lower than the mean values for the north-right and north-left troughs.
Acknowledgement
This work was supported by the Catherine and Robert Fenton Endowed Chair in Biology.
References
Anderson, E. F. 2001. The Cactus Family. Timber Press, Portland, OR. 776 p.
DeBonis, M., L. Barton, and L. S. Evans. 2017. Rates of bark formation on surfaces of saguaro cacti (Carnegiea gigantea). Journal of the Torrey Botanical Society 144: 1-8.
Evans, L. S., J. H. Sullivan, and M. Lim. 2001. Initial effects of UV-B radiation on stem surfaces of Stenocereus thurberi (organ pipe cacti). Environ. Exp. Bot. 46: 181-187.
Evans, L. S., A. J. Young, and Sr. J. Harnett. 2005. Changes in the scale and bark stem surfaces injuries and mortality rates of a saguaro (Carnegiea gigantea) cacti population in Tucson Mountain Park. Can. J. Bot. 83: 311-319.
Evans, L. S., P. Boothe, and A. Baez. 2013. Predicting morbidity and mortality for a saguaro cactus (Carnegiea gigantea) population. J. Torrey Bot. Soc. 140: 247-255.
Evans, L. S. and M. DeBonis. 2015. Predicting morbidity and mortality of saguaro cacti (Carnegiea gigantea). J. Torrey Bot. Soc. 142: 231-239.
Steenbergh, W. F. and C. H. Lowe. 1977. Ecology of the Saguaro: II. Reproduction, germination, establishment, growth and survival of the young plant. National Park Service Monograph Series Eight.
Growth dynamics of Artemisia tridentata

Claudia S. Ramirez*

Laboratory of Plant Morphogenesis, Department of Biology, Manhattan College

Abstract. Stems of Artemisia tridentata Nutt. ssp. tridentata exhibit a cyclic growth pattern that starts in June and ends in November every year. Terminal stem lengths and the number of branches increase throughout the growing season. The mean growth rate for terminal stem lengths was 0.81 mm per day with an r2 value of 0.30, reflecting much variability. The mean rate of increase in the number of branches was 0.08 branches per day with an r2 value of 0.27, also reflecting much variability. Due to the variability among samples, data were standardized using deciles. Results show that the cumulative length of branches increases linearly with stem deciles for both vegetative and reproductive samples. Within the vegetative stems, slope and y-intercept values ranged from 10.7 to 291 mm per decile, and -185 to 232, respectively. In contrast, within the reproductive stems, slope and y-intercept values ranged from 30.8 to 299 mm per decile, and -430 to 218, respectively. The number of seeds increases linearly with cumulative branch length, at a rate of 1.05 seeds per mm of cumulative branch length (y = 1.05x − 204; r2 = 0.89).
Introduction

Artemisia tridentata ssp. tridentata, big sagebrush, is a tall, native shrub. Individual branches are short, and most stems have woody trunks (MacMahon, 1992; USDA plant guide). Depending on soil conditions, the various subspecies of Artemisia tridentata may grow to be up to four meters tall (Fig. 1). Prior to the immigration of Anglo-Americans, species of sagebrush occupied most non-saline portions of Montana, Wyoming, Colorado, Utah, Idaho, and Nevada below 3000 m elevation. In addition, sagebrush subspecies occupy eastern Washington, eastern Oregon, throughout California, and into Baja California, Mexico, above 1000 m elevation (total area = 1.5 million km2; Daubenmire, 1970; Welch, 2005). Within these areas, Artemisia tridentata is well suited to the environments of these regions and is the dominant species in undisturbed areas (Welch, 2005). Many of these areas, however, are now used for agriculture and other practices (MacMahon, 1992; Welch, 2005). Sagebrush plants may live for hundreds of years and produce 10 to 40 terminal shoots each year. Each stem terminal may produce 20 to 30 flowering branches every year (Fig. 2; Evans et al., 2012). Early in the growing season, many indeterminate, vegetative branches (Fig. 3) are produced, and many become determinate, flowering branches (Fig. 4) at the end of the growing season (Evans et al., 2012). The purpose of this study is to focus on the branch growth dynamics of Artemisia tridentata from June until November 2015 to understand changes in terminal stem growth and branches through the period. It was hypothesized that for current-year growth: 1. Terminal stem lengths increase throughout the growing season.
* Research mentored by Lance Evans, Ph.D.
Figure 1. Sagebrush plant (Artemisia tridentata) in vegetative state (only leaves) in the wild. Sagebrush plants grow as individual shrubs.
Figure 2. Sagebrush (Artemisia tridentata) with many flowering stem terminals, showing the reproductive period of the growing season in late October.
Figure 3. A terminal stem sample of Artemisia tridentata. The vegetative stem sample has only leaves. The sample was harvested on June 9, 2015 (Day of Year 160). Image shows the junction, pointed out with a pencil near the base of the stem.
Figure 4. A flowering terminal stem from Artemisia tridentata from September 24, 2015 (Day of Year 267). Image shows the junction of the current year's growth, indicated by the blue pen.
2. The number of branches per stem increases during the growing season. 3. Cumulative length of branches increases linearly with stem deciles. 4. Number of seeds increases linearly with cumulative branch length. Focusing on data from both the branches and the main stem over time allowed us to propose a model for stem growth and branch development over the period. Understanding the growth patterns of Artemisia tridentata is important since each branch produces hundreds of seeds, which play a large ecological role for the wildlife and inhabitants of areas where sagebrush grows.
Materials and Methods

The terminal stems analyzed in this study were shipped from Thistle, Utah (40.00° N, 111.49° W). Stem samples were randomly selected once a week from June to November 2015 and shipped to Manhattan College for processing. Each shipment consisted of six separate terminal
Table 1. Dates of sagebrush (Artemisia tridentata) stem samples from Thistle, Utah, during 2015.

Sample          Day of Year
June 04            155
June 09            160
June 17            168
June 25            175
July 02            183
July 09            190
July 16            197
July 23            204
July 30            211
August 07          219
August 13          225
August 20          232
September 03       246
September 10       253
September 17       260
September 24       267
October 01         274
October 08         281
October 15         288
November 02        306
November 23        327
stem samples. The samples were organized by date to ensure correct analysis. From each mailing box, random terminal stems were selected for examination for each date. The junction separating the previous year's growth from the current year's growth was determined for each sample (Fig. 3, indicated by the pencil near the base of the sample). All branches from the current year were removed. Each detached branch was laid at its node and numbered starting from the junction towards the tip (Fig. 5). Images were taken to document the features of each stem sample. A ruler was placed in all pictures to ensure the scale was correct for subsequent measurements. Images were analyzed in ImageJ (National Institutes of Health, http://rsb.info.nih.gov/ij) for measurements of (1) stem length above the junction, (2) stem diameter at the junction, (3) the number of branches, and (4) branch lengths. Using the computer program Paint, a branch model was created for dissected samples (Fig. 6). To standardize cumulative branch data, deciles were created to scale the variable stem lengths (Fig. 7). In addition, reproductive branches were placed under a dissecting scope (Fig. 8) and seeds were quantified over a length of 10 mm. The branch lengths and cumulative branch length of each sample were used to estimate the number of seeds produced per sample. Data were placed in Excel (Microsoft Inc.) for analysis.
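The decile standardization (Fig. 7) can be sketched as follows. The names and the exact binning rule are our own illustration of the idea, not the procedure's actual implementation.

```python
def decile_cumulative_lengths(branch_positions, branch_lengths, stem_length):
    """Standardize a stem into 10 deciles and accumulate branch lengths.

    branch_positions: distance (mm) of each branch node from the junction.
    branch_lengths: length (mm) of each branch.
    stem_length: current-year stem length (mm).
    Returns the cumulative branch length at each decile (list of 10 values).
    """
    per_decile = [0.0] * 10
    for pos, length in zip(branch_positions, branch_lengths):
        d = min(int(10 * pos / stem_length), 9)   # decile index 0-9
        per_decile[d] += length
    running, cumulative = 0.0, []
    for v in per_decile:
        running += v
        cumulative.append(running)
    return cumulative
```

Because each stem is rescaled to ten deciles regardless of its absolute length, stems from June (short internodes) and November (long internodes) become directly comparable, which is what allows the linear fits of Table 2.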
Results

Terminal stem lengths increase throughout the growing season

For this study, forty-eight terminal stem samples of Artemisia tridentata ssp. tridentata were obtained throughout the growing season (June through November 2015; Day of Year 155 to 327). The relationship between stem length of current-year growth and Day of Year exhibited variability throughout the growing season (Fig. 9). Stems grew continuously from June through November. From Day of Year 150 through 225, all samples were vegetative (only had leaves). However, from Day of Year 225 through 275 there was a mix of vegetative and reproductive stems that continued to elongate. From Day of Year 275 through the end of the reproductive period in late November (Day of Year 350), stems continued to grow, reaching about 300 mm in length. The mean growth rate was 0.81 mm per day with an r2 value of 0.30, reflecting much variability.
Figure 5. A flowering terminal stem of Artemisia tridentata after dissection, from September 24, 2015 (Day of Year 267). Branches have been removed from the stem, assigned a number, and placed at the node. The junction showing current-year growth is indicated by a red "X" near the base of the stem.
Figure 6. A terminal stem of Artemisia tridentata from July 2, 2015; all branches have been removed. The locations of all branches have been marked using Microsoft Paint. The pointer marks the location of the beginning of growth for 2015 (junction). Branches are marked as (V) for vegetative.
Figure 7. Procedure used to create growth deciles to standardize stem lengths. The hypothetical June sample had internodes 15 mm in length, while the hypothetical November sample had internode lengths of 30 mm. Each standardized stem comprised 10 deciles, in which the hypothetical length was 10 mm per internode.
Figure 8. Image of a reproductive branch with seeds of Artemisia tridentata, enlarged 10×.
Figure 9. Relationship between stem lengths of the current-year growth and Day of Year for 48 samples of Artemisia tridentata (y = 0.81x + 39.3; r2 = 0.30). Green data represent vegetative growth and orange data represent reproductive growth. The vegetative period was from day 150 through 225 and the reproductive period was between day 275 through 350. For days 225 through 275 there was a mixture of vegetative and reproductive growth.
Figure 10. Relationship between the number of branches on terminal stems and Day of Year for current-year growth of 48 samples of Artemisia tridentata (y = 0.08x + 7.75; r2 = 0.27). Green data represent vegetative growth and orange data represent reproductive growth.
The number of branches per stem increases during the growing season

The number of branches per terminal stem increased at a mean rate of 0.08 branches per day from June to November 2015 (Fig. 10), with an r2 value of 0.27, reflecting much variability.

Cumulative length of branches increases linearly with stem deciles

Based upon the variability among samples throughout the growing season (Figs. 11 and 12), data were standardized in deciles. After standardization, cumulative branch lengths were linearly related with deciles (Fig. 13; Table 2). For the forty-seven (47) stems, all r2 values were between 0.84 and 0.99. For the twenty-seven (27) vegetative stems (only leaves present), the mean slope was 48.5 with a y-intercept near zero. In contrast, for the twenty (20) reproductive stems (flowers and seeds present), the mean slope was 164 with a y-intercept value of -164. A T-test showed statistically significant differences in slopes and y-intercepts between the two groups. Within the vegetative stems, slope and y-intercept values ranged from 10.7 to 291 mm per decile, and -185 to 232, respectively. In contrast, within the reproductive stems, slope and y-intercept values ranged from 30.8 to 299 mm per decile, and -430 to 218, respectively. Clearly, cumulative branch lengths differed between the two stem groups.
Number of seeds increases linearly with cumulative branch length For the twelve samples from September through November 2015, the relationship between cumulative branch length and number of seeds was strong. The number of seeds increased at a rate of 1.05 per cumulative branch length (y = 1.05x − 204; r2 = 0.89; Fig. 14).
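The r2 values reported throughout are the coefficient of determination of a least-squares line; a short sketch (our naming, computed here from first principles rather than Excel):

```python
from statistics import mean

def r_squared(xs, ys):
    """Coefficient of determination for a least-squares line fit of ys on xs,
    the r2 reported throughout (e.g., r2 = 0.89 for seeds vs. cumulative
    branch length in Fig. 14)."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)              # variation in x
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    syy = sum((y - my) ** 2 for y in ys)              # variation in y
    return (sxy * sxy) / (sxx * syy)
```

A perfectly linear relationship gives r2 = 1; the value of 0.89 for seeds vs. cumulative branch length indicates that branch length alone explains most of the variation in seed number.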
Figure 11. Image shows the variety of samples taken July 2, 2015, for vegetative growth. Stem lengths vary from 128 mm to 345 mm, showing the significant variability between stems in the wild.
Figure 13. Relationship between cumulative branch lengths and deciles for 27 vegetative stem samples (green data) (y = 31.3x − 5.2, r2 = 0.26) and 20 reproductive stem (orange data) samples (y = 178.5x − 176.6, r2 = 0.56) of Artemisia tridentata.
Figure 12. Image shows the variety of samples taken October 8, 2015, for reproductive growth. Stem lengths vary from 185 mm to 308 mm, showing the significant variability between samples in the wild.
Figure 14. Relationship between the number of seeds and cumulative branch length for 12 terminal stems of Artemisia tridentata (y = 1.05x − 204.54, r2 = 0.89) from late September through November.
Table 2. Slopes, y-intercepts, and r2 values for relationships between deciles and cumulative branch lengths for individual vegetative and reproductive stem samples of Artemisia tridentata for 2015. Slopes, y-intercepts, and r2 values were computed with Excel using a linear fit.

Vegetative samples

Sample             Slope   y-intercept    r2
June 04-01          21.2       72.7      0.96
June 04-02          21.6      125        0.91
June 04-03          19.4      -11.5      0.92
June 09-01          93.6     -180        0.97
June 17-01          27.1       11.9      0.93
June 17-02          14.6      -17        0.99
June 17-03          13         -0.38     0.93
June 17-04          13.4      -24.1      0.96
June 17-05          17.5      -25.3      0.96
June 25-01          13.7       78.5      0.90
July 02-01          37.1      -46.7      0.98
July 09-01          44        232        0.92
July 16-01         291        110        0.99
July 23-01          74.5      -66.4      0.97
July 23-02          21.2      -12.8      0.98
July 23-03          46.4      -58.2      0.96
July 30-01          26        -19.9      0.95
August 07-01        24         12.8      0.98
August 07-02        84       -185        0.95
August 07-03        33        -78.2      0.84
August 13-03        16         31.4      0.93
August 20-03        13          6.39     0.98
September 03-02    193        -96.8      0.94
September 10-01     41.5      -28.3      0.97
September 17-01     10.7       -1.38     0.96
September 24-02     85.2      139        0.85
MEAN                48.5       -1.47     0.95
S.D.                61.1       87.9      0.04
T-test             <0.01      <0.01      0.26

Reproductive samples

Sample             Slope   y-intercept    r2
August 13-01       106       -148        0.99
August 13-02       128       -259        0.96
August 20-01        30.8      -76.1      0.93
August 20-02       217        -63.3      0.97
September 03-01     87       -140        0.98
September 10-02    254       -156        0.95
September 17-02    239       -140        0.98
September 17-03    160       -337        0.94
September 17-04    113       -197        0.98
September 24-01    182       -255        0.99
September 24-03    162       -216        0.97
September 24-04    270       -123        0.97
October 01-02      136        -52.9      0.99
October 01-13      141       -128        0.99
October 08-01      222        218        0.84
October 15-01      117       -316        0.88
October 15-02       99.4      -55        0.96
November 02-01     157        -81.6      0.99
November 23-01     169       -321        0.96
November 23-02     299       -430        0.98
August 13-04        12.4       -8.5      0.99
MEAN               164       -164        0.96
S.D.                66.5      136        0.04
Discussion
This study documents the seasonal development of sagebrush plants from early spring through late fall. Since sagebrush is a perennial plant, terminal stems that became dormant the previous winter resume growth the following spring. Although there is much variability in stem characteristics at each sampling, main stems elongate linearly and the number of branches increases throughout the growing season (days of year 150 through 320). Plants produce branches with leaves during the early part of the season and flowers and seeds by mid-season. Most stems began to produce flowers and seeds at the beginning of October, although some started as early as mid-August. Branches near junctions elongated proportionally more than branches near terminals (tips) during the reproductive period, and thus branches near the junctions produced more seeds than branches near terminals. All branches that produced seeds were determinate branches and died at the end of the growing season. In contrast, several indeterminate branches that remained vegetative survived and became stem terminals for the next growing season. Overall, about thirty branches per stem became determinate while only two to three branches were indeterminate. As stated above, there was considerable variation in stem characteristics at each sampling. Due to this variability, samples were standardized using deciles. Once the data were standardized, the cumulative branch lengths were longer for reproductive samples than for vegetative samples. Moreover, numbers of seeds were linearly correlated with cumulative branch lengths. On average, each plant had about twenty terminal stems, each terminal stem had about thirty determinate branches, and each branch produced about 1,400 seeds; thus, each plant produces roughly 840,000 (20 × 30 × 1,400) seeds. More samples will be processed over the next year and other growth aspects will be studied. Results from continued studies will provide a more complete analysis of the growth, development, and reproduction of A. tridentata during the growing season.
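The per-plant seed estimate is simple arithmetic; this sketch just makes the three factors implied by the 20 × 30 × 1,400 product in the text explicit:

```python
# Per-plant seed estimate from the factors given in the text.  The
# product implies ~1,400 seeds per determinate branch.
terminal_stems_per_plant = 20
determinate_branches_per_stem = 30
seeds_per_branch = 1400

seeds_per_plant = (terminal_stems_per_plant
                   * determinate_branches_per_stem
                   * seeds_per_branch)
print(seeds_per_plant)  # 840000
```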
Acknowledgment
The author is grateful to the Catherine and Robert Fenton Endowed Chair in Biology, held by Dr. Lance Evans, for financial support and mentorship of this research. The author is also grateful to the Linda and Dennis Fenton ’78 endowed biology research fund for financial support.
References
Daubenmire, R. 1970. Steppe vegetation of Washington. Tech. Bull. 62. Washington State University, Washington Agricultural Experiment Station, College of Agriculture, Pullman, WA. 131 p.
Diettert, R.R. 1938. The morphology of Artemisia tridentata Nutt. Lloydia 1:3-74.
Evans, L.S., A. Citta, and S.C. Sanderson. 2012. Flowering branches cause injuries to second-year main stems of Artemisia tridentata Nutt. subspecies tridentata. Western North American Naturalist 72:447-456.
MacMahon, J.A. 1992. Deserts. Alfred A. Knopf, New York, NY.
USDA Plant Guide. United States Department of Agriculture, Natural Resources Conservation Service. http://plants.usda.gov/plantguide/pdf/pg artr2.pdf
Welch, B. 2005. Big Sagebrush: a sea fragmented into lakes, ponds, and puddles. General Technical Report RMRS-GTR-144. Fort Collins, CO.
Xylem conductivity in stems of Artemisia tridentata
Victoria Webb∗
Laboratory of Plant Morphogenesis, Department of Biology, Manhattan College
Abstract. Artemisia tridentata is the dominant shrub species in the Great Basin Desert, covering vast regions of the northwestern United States. Sagebrush obtains most of its water from winter snowfall, and this moisture supports the plant’s growth. The growth cycle of Artemisia tridentata begins with vegetative growth in the spring and ends with reproductive growth in the fall. Reproductive terminal stems grow rapidly to produce approximately 200,000 seeds per plant in late autumn. The purpose of this study was to determine the relationship between xylem conductivity and stem and cumulative branch lengths. Data for vegetative and reproductive stems were compared at the transition zone and as the stems elongated. Xylem vessels were counted, xylem conductivities were calculated, and stem and branch lengths were measured at the transition zone and along the entire length of both vegetative and reproductive stems. The total number of xylem vessels was positively related to stem lengths and cumulative branch lengths at the transition zone (y = 0.465x + 1.16 × 10³; r2 = 0.50). Xylem conductivity was also positively related to stem lengths and cumulative branch lengths at the transition zone (y = 4.02 × 10⁻⁴x + 0.708; r2 = 0.37). Parameters for reproductive stems were significantly greater than those for vegetative stems.
Introduction
Artemisia tridentata, otherwise known as ‘Big Sagebrush’ (Fig. 1), is a shrub that grows primarily in a region of the northwestern United States called the Great Basin Desert (McMahon, 1985; West, 1999; Welch, 2005). Sagebrush plants cover large expanses of Utah, Idaho, Nevada, Montana, Colorado, and Wyoming. These shrubs grow from 2 to 13 feet tall (Daubenmire, 1970; USDA Plant Guide). Sagebrush plants obtain water from snow during the winter months because little precipitation falls during the rest of the year. These shrubs are perennials, beginning their growth cycle in April with vegetative growth and ending the cycle with reproductive growth in November (Figs. 2 and 3; Evans et al., 2012), followed by dormancy in the winter. Consequently, stem, branch, and seed production takes place during a short period of time. Early in the growth cycle, vegetative stems elongate and produce a large number of branches with leaves. Late in the growth cycle, the leaves die and flowers and seeds are produced (Figs. 2 and 3; Evans et al., 2012). Sagebrush plants obtain soil moisture from snow and snowmelt during winter and spring, and this moisture must sustain them throughout their growth cycle. For growth to occur, water must be transported to the plant’s stems, branches, flowers, and seeds. Xylem vessels are responsible for moving water throughout plants. This study documented xylem development and xylem conductivity in elongating vegetative and reproductive stems of Artemisia tridentata. Our purpose was to determine (a) whether stems have a greater number of xylem vessels and greater xylem conductivity when they are reproductive rather than vegetative, and (b) whether there is a constant relationship between xylem conductivity and stem lengths and the cumulative length of all branches on the stem. ∗
Research mentored by Lance Evans, Ph.D.
Figure 1. Image of several plants of Artemisia tridentata. Note the individual terminal shoots that are grayish in color.
Figure 2. Excised vegetative terminal stem of Artemisia tridentata. Note the large number of branches. The pencil point indicates the transition zone between stem growth in 2014 (left) and in 2015 (right).
Figure 3. Excised terminal stem of Artemisia tridentata with flowers/seeds on every branch. Note the large number of branches. The pen point indicates the transition zone between stem growth in 2014 (left) and in 2015 (right).
It was hypothesized that:
1. At transition zones, xylem conductivities are larger for reproductive stems than for vegetative stems.
2. Xylem conductivities are proportional to stem lengths and cumulative branch lengths as stems elongate and produce more branches.
Materials and Methods
Processing stem samples
From June through November 2015, six randomly selected stem samples of Artemisia tridentata were cut each week in Thistle, Utah (40.0° N, 111.5° W) and sent to Manhattan College. To begin processing each stem sample, a mark was placed at the transition zone, the boundary between 2014 stem growth and the current year’s 2015 growth. All branches produced prior to 2015 were removed, so that only 2015 growth was processed. All 2015 branches
were removed, placed on plain white paper next to their nodes on the stem, and assigned numbers (Figs. 4 and 5). Every image included a ruler to provide a scale. Images were uploaded into ImageJ (NIH, https://imagej.nih.gov/ij/), which was used to measure the lengths of branches and stems.
Figure 4. Excised terminal stem of Artemisia tridentata with flowers/seeds on every branch. To show the distribution of branches, each branch was removed and placed next to its node of origin. Note the lengths of the branches. An “X” was placed at the transition zone.
Figure 5. Excised terminal stem of Artemisia tridentata shown in Fig. 4. The branches have been removed and the branch numbers are placed at their position so that the location of each branch along the main stem could be determined.
Histology
Stem tissue samples were collected at various locations (e.g., TH01, TH02) (Fig. 6). Samples were fixed in FAA solution for 24 hours (Jensen, 1962) and dehydrated in a tertiary butanol (BX1805-1, Fisher Scientific) series. Samples were embedded in Paraplast wax (McCormick Scientific, Richmond, IL) and sectioned at 35 µm with a rotary microtome. Cross sections were mounted on microscope slides, stained with safranin, and covered with Canada balsam (CAS800747-4, Acros, Fisher Scientific, Pittsburgh, PA).
Figure 6. Image of the excised terminal stem of Artemisia tridentata. The branches have been removed, branch numbers have been placed at their position and tissue sample numbers have been added. The pencil indicates the transition zone between 2014 and 2015 growth.
Numbers of xylem vessels per tissue section
Analyses of numbers of xylem vessels varied depending upon the overall development of the xylem tissues. Less mature stem tissues had distinct vascular bundles with little secondary xylem, while more mature stem sections had extensive secondary xylem. For stem tissues with distinct vascular bundles, all vessels in two randomly selected bundles were counted, and these values were scaled up to estimate the number of vessels in the entire stem section. In more mature stem sections, one quarter of the xylem area was randomly selected for analysis, and the count was likewise scaled up to estimate the number of vessels in the entire stem section.

Calculations of xylem conductivities
Images of stem cross sections were evaluated at 40× magnification (Fig. 7). With ImageJ, two diameter measurements (in micrometers) were taken for each of seven to ten randomly selected xylem vessels per cross section (Fig. 8). Diameters were converted to radii, which were averaged and used to calculate xylem conductivity (McCulloh et al., 2009). The xylem conductivity of a cross section (in units of g·cm·MPa⁻¹·s⁻¹) was calculated using the Hagen-Poiseuille equation:

    conductivity = [π × (number of conduits) × (average conduit radius, in cm)⁴] / [8 × (viscosity of water)]
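A minimal sketch of the Hagen-Poiseuille calculation above, assuming a viscosity of water of about 1.002 × 10⁻⁹ MPa·s (20 °C). The expression is implemented exactly as printed; the density of water does not appear in the printed formula, so it is omitted here as well.

```python
import math

# Assumed value: dynamic viscosity of water at ~20 C, converted to MPa*s.
WATER_VISCOSITY_MPA_S = 1.002e-9

def xylem_conductivity(n_conduits, radii_cm):
    """Hagen-Poiseuille conductivity: pi * n * r_mean^4 / (8 * eta).

    radii_cm is the list of measured vessel radii (cm); their mean is
    used, as in the text.
    """
    r_mean = sum(radii_cm) / len(radii_cm)
    return math.pi * n_conduits * r_mean ** 4 / (8 * WATER_VISCOSITY_MPA_S)
```

With roughly 1,250 vessels and radii on the order of 10 µm (1 × 10⁻³ cm), this lands in the same order of magnitude as the vegetative-stem conductivities reported below; the r⁴ term makes the result very sensitive to the measured radii.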
Figure 7. Image of a stem cross section of Artemisia tridentata. P = pith, X = xylem cells, and B = bark.
Figure 8. Xylem cells. Six xylem cells are marked with an “×”. There were many other xylem cells without an “×”.
Results
Xylem characteristics at transition zones
The transition zone characteristics of reproductive stems were significantly different (p < 0.01) from those of vegetative stems (Table 1). The radii of reproductive xylem vessels were not significantly different from the radii of vegetative xylem vessels. Average stem lengths for vegetative
Table 1. Comparisons of stem characteristics at the transition zone between 2014 and 2015 growth for vegetative and reproductive Artemisia tridentata stems. Lengths are in mm; conductivities are in g·cm·MPa⁻¹·s⁻¹.

Vegetative stems
Sample            Stem length   Cum. branch length   Sum of lengths   Conductivity   No. of xylem vessels
June 04-02        118           312                  430              1.27           1130
June 17-01        159           238                  397              1.26           1160
June 17-02        160           124                  284              0.656          1320
June 25-01        146           551                  697              0.152          860
July 02-01        177           296                  473              0.691          1180
July 09-01        211           437                  647              0.689          1810
August 07-01      205           236                  441              0.649          1320
Mean              168           313                  481              0.767          1250

Reproductive stems
Sample            Stem length   Cum. branch length   Sum of lengths   Conductivity   No. of xylem vessels
August 20-01      305           2530                 2840             1.76           2920
September 10-01   199           366                  565              0.527          1620
September 17-01   192           95.7                 287              1.16           1830
September 17-02   341           2100                 2440             1.60           1860
September 24-01   261           1490                 1750             0.846          1940
October 01-01     282           1470                 1752             1.59           1220
October 01-02     201           1090                 1290             1.81           1600
October 08-01     236           2040                 2270             1.42           2310
October 15-01     194           953                  1150             1.69           2150
November 02-01    210           1370                 1580             2.02           2410
Mean              242           1350                 1592             1.44           1990
T-test (a)        < 0.01        < 0.01               < 0.01           < 0.01         < 0.01

(a) The T-test results compared vegetative versus reproductive stems for each parameter above.
and reproductive stems were 168 and 242 mm, respectively. Average cumulative branch lengths for vegetative and reproductive stems were 313 and 1.35 × 10³ mm, respectively. Numbers of xylem vessels for vegetative and reproductive stems were 1.25 × 10³ and 1.99 × 10³, respectively. Numbers of xylem vessels were positively related to stem lengths and cumulative branch lengths (y = 0.465x + 1.16 × 10³; r2 = 0.50; Fig. 9). Xylem conductivities were positively related to stem lengths and cumulative branch lengths (y = 4.02 × 10⁻⁴x + 0.708; r2 = 0.37; Fig. 10). Average conductivities for vegetative and reproductive stems were 0.767 and 1.44 g·cm·MPa⁻¹·s⁻¹, respectively.

Xylem characteristics as stems elongate
During stem elongation, xylem conductivities should increase from stem terminals toward more mature tissues. The results above show that xylem conductivities are strongly correlated with stem lengths and cumulative branch lengths at transition zones. The purpose of this portion of the study was to determine how xylem conductivities differed as a function of distance from the terminals and cumulative branch lengths. Xylem conductivities for vegetative and reproductive
Figure 9. Relationship between number of xylem vessels with stem lengths and cumulative branch lengths at the transition zone between 2014 and 2015 growth for seven vegetative (circles) and ten reproductive (diamonds) stems of Artemisia tridentata (y = 0.465x + 1.16 × 103 ; r2 = 0.50).
Figure 10. Relationship between xylem conductivities with stem lengths and cumulative branch lengths at the transition zone between 2014 and 2015 growth for seven vegetative (circles) and ten reproductive (diamonds) stems of Artemisia tridentata (y = 4.02 × 10−4 x + 0.708; r2 = 0.37).
stems were similar as a function of stem lengths and cumulative branch lengths (Table 2). Distributions of these relationships are shown in Fig. 11.

Table 2. Linear regression characteristics of stem xylem conductivities as a function of stem lengths and cumulative branch lengths for vegetative and reproductive stems of Artemisia tridentata for 2015 (see Fig. 11).

Vegetative stems
Date              Slope          y-intercept   r2
June 17-01        3.22 × 10⁻³    -0.0597       0.97
June 17-02        1.89 × 10⁻³     0.0320       0.90
June 25-01        2.45 × 10⁻⁴     0.0120       0.80
July 02-01        1.06 × 10⁻³     0.0476       0.75
August 07-01      1.88 × 10⁻³    -0.184        1.0
Mean              1.66 × 10⁻³    -0.0304       0.89

Reproductive stems
Date              Slope          y-intercept   r2
August 20-02      3.02 × 10⁻⁴    -0.131        0.87
September 17-01   4.02 × 10⁻³    -0.0780       0.96
October 01-02     9.61 × 10⁻⁴    -0.120        0.58
Mean              1.76 × 10⁻³    -0.110        0.80
T-test (a)        0.94            0.14         0.56

(a) The T-test results compared vegetative versus reproductive stems.
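As a cross-check of the group comparisons above, Welch's t statistic for stem lengths at the transition zone can be recomputed from the values listed in Table 1. The t value itself is our own re-computation for illustration; the paper reports only p < 0.01.

```python
import math
import statistics as st

# Stem lengths (mm) at the transition zone, copied from Table 1.
vegetative = [118, 159, 160, 146, 177, 211, 205]
reproductive = [305, 199, 192, 341, 261, 282, 201, 236, 194, 210]

def welch_t(a, b):
    """Welch's t statistic for two samples with unequal variances."""
    va, vb = st.variance(a), st.variance(b)   # sample variances
    return (st.mean(b) - st.mean(a)) / math.sqrt(va / len(a) + vb / len(b))

t = welch_t(vegetative, reproductive)
```

The statistic comes out around 3.6 on roughly 15 degrees of freedom, comfortably consistent with the p < 0.01 reported in Table 1.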
Figure 11. Relationships between xylem conductivities with stem lengths and cumulative branch lengths for individual stems of Artemisia tridentata. Data of five vegetative stems (green) and three reproductive stems (orange) are shown. Data of individual stems are shown in Table 2.

Discussion
Sagebrush plants are an important plant species in the western United States, where very cold winters and dry, hot summers can support few plant species. Comparable plants occupy similar environments elsewhere on Earth: a variety of species inhabiting the steppes of Mongolia and other areas of eastern Russia endure conditions similar to those experienced by sagebrush. None of the steppe plants of Asia, however, are as widespread or reproductively successful as sagebrush. Thus, sagebrush plants are uniquely suited to their environment and are displaced only by human activities. These plants, which can live for 50 to 150 years, are unsurpassed due to their short growth cycle and ability to produce large quantities of seeds.

This study investigated the relationship between xylem conductivity and stem and cumulative branch lengths in vegetative and reproductive stems of Big Sagebrush. Vegetative stems elongated to lengths of 118 to 211 mm, with cumulative branch lengths averaging 313 mm, by August 7, 2015 (day of year 219). These stems had about 1,250 vessels per stem and an average xylem conductivity of 0.77 g·cm·MPa⁻¹·s⁻¹. By November 2, 2015 (day of year 306), reproductive stem lengths averaged 242 mm with cumulative branch lengths averaging 1,350 mm. These stems had about 2,000 xylem vessels and an average xylem conductivity of 1.44 g·cm·MPa⁻¹·s⁻¹. Vegetative and reproductive stems were statistically different for all of the aforementioned parameters. The significant difference in xylem conductivity between vegetative and reproductive samples is due to the difference in the number of xylem vessels; the xylem radii were not significantly different and therefore did not contribute to the difference in conductivity.

These data show the change in growth of Artemisia tridentata from April to November. The water obtained during the winter months led to rapid vegetative stem elongation, and during the reproductive stage stems showed a marked increase in stem and branch lengths. This increase in length contributes to the abundant seed production of reproductive stems: approximately 200,000 seeds are produced per sagebrush plant from August to November. The high conductivity values and vessel numbers of reproductive stems coincide with this extensive reproductive growth and seed abundance.
The relationship between the number of xylem vessels and stem and cumulative branch lengths had an r2 value of 0.50, whereas the relationship between xylem conductivity and stem and cumulative branch lengths had r2 = 0.37. These relationships will likely become stronger once more stems have been analyzed.
Acknowledgments
The author is indebted to the Catherine and Robert Fenton Endowed Chair in Biology, held by Dr. Lance S. Evans, for financial support of this research.
References
Daubenmire, R. 1970. Steppe vegetation of Washington. Technical Bulletin 62. Washington State Agricultural Experiment Station, College of Agriculture, Washington State University, Pullman, WA.
Evans, L.S., A. Citta, and S.C. Sanderson. 2012. Flowering branches cause injuries to second-year main stems of Artemisia tridentata Nutt. subspecies tridentata. Western North American Naturalist 72:447-456.
Jensen, W.A. 1962. Botanical Histochemistry. W.H. Freeman, San Francisco, CA. 408 p.
McCulloh, K.A., J.S. Sperry, F.C. Meinzer, B. Lachenbruch, and C. Arala. 2009. Murray’s law, the ‘Yarrum’ optimum, and the hydraulic architecture of compound leaves. New Phytologist 184:234-244.
McMahon, J.A. 1985. Deserts. Alfred A. Knopf, New York, NY.
USDA Plant Guide. United States Department of Agriculture, Natural Resources Conservation Service. http://plants.usda.gov/plantguide/pdf/pg artr2.pdf
Welch, B. 2005. Big Sagebrush: a sea fragmented into lakes, ponds, and puddles. General Technical Report RMRS-GTR-144. Fort Collins, CO.
West, N.E. 1999. Managing for biodiversity of rangelands. Pages 101-126 in W.W. Collins and C.O. Qualset, editors, Biodiversity in agroecosystems. CRC Press, Boca Raton, FL.
Theoretical and experimental design of efficient polycyclic aromatic hydrocarbon adsorbents
Jeovanna Badson∗
Department of Chemistry and Biochemistry, Manhattan College
Abstract. Phenanthrene, a polycyclic aromatic hydrocarbon, was examined with an array of lipophilic carboxylic acids to determine their binding ability. The quantum calculations for these interactions were completed using the PM3 semi-empirical Hamiltonian parameter set from the Spartan ’16 software suite. The molecules used to interact with phenanthrene had carbon chains of varying lengths, different ring structures, and varying numbers of rings. The acids considered were 3,3-diphenylpropionic acid, phenylpropionic acid, linoleic acid, phenylacetic acid, diphenylacetic acid, decanoic acid, cyclohexaneacetic acid, stearic acid, α-phenylcyclopentylacetic acid, and 1-naphthaleneacetic acid.
Introduction
Polycyclic aromatic hydrocarbons (PAHs) are organic compounds containing only carbon and hydrogen atoms and composed of multiple aromatic rings, i.e., organic rings in which the electrons are delocalized. PAHs are uncharged, non-polar molecules. They are known to be ubiquitous carcinogens [1, 2, 3, 4, 5], created primarily by the incomplete combustion of organic matter. They are highly lipid soluble and therefore easily absorbed in the gastrointestinal tract of mammals. PAHs are often found in the surrounding air, both in the gas phase and sorbed to aerosols. The simplest such compounds are naphthalene, which has two aromatic rings, and anthracene and phenanthrene, each containing three rings.
Method
Design and calculation of all molecules were done in Spartan ’16 [6]. The quantum chemical calculations for each molecule were executed using the PM3 semi-empirical Hamiltonian parameter set incorporated in the Spartan ’16 software suite. A major bottleneck in quantum chemical calculations is the evaluation of four-center electron-electron repulsion integrals. Semi-empirical methods such as PM3 offer a solution to this dilemma: by using experimental ionization potentials and electron affinities in place of these integrals, the calculations can be performed at much greater speed (see, for example, [7]). For a step-by-step procedure for building the molecules and calculating their energies, refer to the Spartan ’16 for Windows, Macintosh and Linux Tutorial and User’s Guide [6]. A template of the prototype chemical reaction is shown in Fig. 1. Data presented in Table 1 include the calculated heat of formation of each lipophilic carboxylic acid ‘target,’ the heat of formation of the phenanthrene ‘bullet,’ the heat of formation of the corresponding ‘target-bullet’ aggregate, and finally the overall heat of reaction for forming the aggregate. ∗
Research mentored by Joseph Capitani, Ph.D.
Figure 1. Template chemical reaction
Data and Discussion
Table 1 displays the theoretical calculations alongside the corresponding experimental values. The theoretical reaction energies (Erxn) were calculated by subtracting the combined energy of the reactants (the phenanthrene plus the acid) from the energy of the acid-phenanthrene product complex. Fig. 2 shows a graphical comparison of the theoretical reaction energies with the experimental PAH capacities of the same lipophilic carboxylic acids. The discrepancy seen in the graph could be due to solvent effects, entropy effects, and the difficulty of the experiment. Despite these differences, these calculations and experimental data are a precursor to more precise density functional calculations. To further refine this work, solvent effects and density functional theory should be added to the quantum mechanical calculations.

Table 1. Semi-empirical calculations and experimental PAH capacities of the lipophilic carboxylic acids with phenanthrene. Energy of phenanthrene = 230.33 kJ/mol; 1 acid : 1 phenanthrene.

Lipophilic carboxylic acid        E(acid) (kJ/mol)   E(product) (kJ/mol)   Erxn (kJ/mol)   PAH capacity (mg PAH/g SiO2)
3,3-Diphenylpropionic acid        -200.19             25.93                 -4.11          0.1890 ± 0.0567
Phenylpropionic acid              -328.95           -108.27                 -9.55          0.1266 ± 0.0016
Linoleic acid                     -598.17           -392.74                -24.8           0.2807 ± 0.0301
Phenylacetic acid                 -304.99            -83.99                 -9.23          0.1332 ± 0.0199
Diphenylacetic acid               -190.73             49.96                 10.46          0.2144 ± 0.0211
Decanoic acid                     -603.52           -382.09                 -8.8           0.1043 ± 0.0182
Cyclohexaneacetic acid            -520.06           -317.76                -27.93          0.1291 ± 0.0020
Stearic acid                      -807.59           -610.05                -32.69          0.1419 ± 0.0061
α-Phenylcyclopentylacetic acid    -354.12           -134.34                -10.45          0.1650 ± 0.0914
1-Naphthaleneacetic acid          -228.42             26.68                 24.87          0.1472 ± 0.0934
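The reaction-energy bookkeeping behind Table 1 can be reproduced directly. The sketch below recomputes Erxn for three of the acids from the tabulated heats of formation; the results agree with the tabulated Erxn values to about 0.1 kJ/mol, presumably because of rounding in the tabulated phenanthrene energy.

```python
# E_rxn = E(product complex) - [E(phenanthrene) + E(acid)], all in kJ/mol.
# Heats of formation are copied from Table 1 (PM3 values).
E_PHENANTHRENE = 230.33

acids = {
    "3,3-diphenylpropionic acid": (-200.19, 25.93),
    "linoleic acid": (-598.17, -392.74),
    "stearic acid": (-807.59, -610.05),
}

def e_rxn(e_acid, e_product):
    return e_product - (E_PHENANTHRENE + e_acid)

for name, (e_acid, e_prod) in acids.items():
    print(name, round(e_rxn(e_acid, e_prod), 2))
```

A negative Erxn indicates a favorable aggregate; stearic acid gives the most negative value of the three, matching its ranking in Table 1.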
Figure 2. Comparison of the theoretical Erxn (kJ/mol) (red data) vs. the experimental PAH capacity (mg PAH/g SiO2) (blue data) for the ten lipophilic carboxylic acids. Line represents a linear fit to the Erxn data.

The phenanthrene molecule is planar, as seen in Fig. 3. In addition, the electrostatic potential that surrounds the molecule (Fig. 4) illustrates how electronegative or electropositive various areas of the molecule are. The colors range from red (most electronegative) to blue (most electropositive). Fig. 4 is mostly green and yellow, indicating a mildly electropositive surface. This makes sense, since phenanthrene is a non-polar molecule.
Figure 3. Phenanthrene molecule
Figure 4. Electrostatic potential of phenanthrene molecule
The phenanthrene forms complexes with each of the lipophilic carboxylic acids, and the theoretical calculations support this. The “glue” that holds these complexes together is not π–π stacking; instead, it is most likely an electrostatic quadrupole-quadrupole interaction, as in the archetypal aromatic molecule, benzene. We know this because the benzene rings in the lipophilic carboxylic acids and the phenanthrene do not stack on each other as if one were “stacking pancakes.” Instead, the benzene rings and the phenanthrene form a T-shaped structure. This is clearly seen in Figs. 5 and 6, which show the interaction of phenanthrene with 3,3-diphenylpropionic acid: the phenanthrene forms the T-shape with one of the phenyl groups of the acid. Fig. 7 shows linoleic acid as it interacts with phenanthrene. In comparison, stearic acid interacts more strongly with phenanthrene, according to the Spartan [6] calculations. This could be due to the “kinks” in the linoleic structure that reduce the surface area with which the phenanthrene is able to interact. Fig. 8 displays the electrostatic potential of this interaction and indicates that the linoleic acid creates a pocket of sorts into which the phenanthrene can insert itself.
This is another example of the quadrupole-quadrupole interaction of the phenanthrene, since it also forms a T-shaped complex.
Figure 5. Phenanthrene molecule interacting with 3,3-diphenylpropionic acid
Figure 6. Electrostatic potential of phenanthrene molecule interacting with 3,3-diphenylpropionic acid
Figure 7. Phenanthrene molecule interacting with linoleic acid
Figure 8. Electrostatic potential of phenanthrene molecule interacting with linoleic acid
Fig. 9 illustrates another long-chain carboxylic acid, decanoic acid. Decanoic acid interacts with phenanthrene similarly to stearic acid in terms of the available surface area (Fig. 10). However, due to its shorter chain, its interaction energy is much weaker (less negative) than that of stearic acid. Fig. 11 illustrates the interaction between stearic acid and phenanthrene, which has one of the strongest theoretical reaction energies (Table 1). Fig. 12 further illustrates the hydrophobic interaction of these two molecules using the electrostatic potential. The remaining figures (Figs. 13-24) demonstrate the interactions of the other acids with phenanthrene, each image showing the quadrupole-quadrupole, T-shaped interaction that we expect to be important for this binding.
Figure 9. Phenanthrene interacting with decanoic acid
Figure 10. Electrostatic potential of phenanthrene interacting with decanoic acid
Figure 11. Phenanthrene interacting with stearic acid
Figure 12. Electrostatic potential of phenanthrene interacting with stearic acid
Figure 13. Phenanthrene molecule interacting with phenylpropionic acid
Figure 14. Electrostatic potential of phenanthrene molecule interacting with phenylpropionic acid
Figure 15. Phenanthrene molecule interacting with phenylacetic acid
Figure 16. Electrostatic potential of phenanthrene molecule interacting with phenylacetic acid
Figure 17. Phenanthrene interacting with diphenylacetic acid
Figure 18. Electrostatic potential of phenanthrene interacting with diphenylacetic acid
Figure 19. Phenanthrene interacting with cyclohexaneacetic acid
Figure 20. Electrostatic potential of phenanthrene interacting with cyclohexaneacetic acid
Figure 21. Phenanthrene interacting with α-phenylcyclopentylacetic acid
Figure 22. Electrostatic potential of phenanthrene interacting with α-phenylcyclopentylacetic acid
Figure 23. Phenanthrene interacting with 1-naphthaleneacetic acid
Figure 24. Electrostatic potential of phenanthrene interacting with 1-naphthaleneacetic acid
Acknowledgments This work was supported by the Michael J. ’58 and Aimee Rusinko Kakos endowed chair in science. The author would like to thank Dr. Joseph Capitani, Dr. John Regan, and Dr. Jianwei Fan for all the helpful discussions and their continuous guidance and support throughout this research project.
References
[1] Meyers, P. A. and Ishiwatari, R. (1993). “Lacustrine organic geochemistry - an overview of indicators of organic matter sources and diagenesis in lake sediments.” Organic Geochemistry 20(7): 867-900. doi:10.1016/0146-6380(93)90100-P.
[2] Silliman, J. E., Meyers, P. A., Eadie, B. J., and Val Klump, J. (2001). “A hypothesis for the origin of perylene based on its low abundance in sediments of Green Bay, Wisconsin.” Chemical Geology. 177 (3-4): 309-322. Bibcode:2001ChGeo.177..309S. doi:10.1016/S0009-2541(00)00415-0. ISSN 0009-2541. Retrieved 2015-02-04.
[3] Wakeham, S. G., Schaffner, C., and Giger, W. (1980). “Polycyclic aromatic hydrocarbons in Recent lake sediments - II. Compounds derived from biogenic precursors during early diagenesis.” Geochimica et Cosmochimica Acta. 44 (3): 415-429. Bibcode:1980GeCoA..44..415W. doi:10.1016/0016-7037(80)90041-1. ISSN 0016-7037. Retrieved 2015-02-04.
[4] Bostrom, C.-E., Gerde, P., Hanberg, A., Jernstrom, B., Johansson, C., Kyrklund, T., Rannug, A., Tornqvist, M., Victorin, K., and Westerholm, R. (2002). “Cancer risk assessment, indicators, and guidelines for polycyclic aromatic hydrocarbons in the ambient air.” Environmental Health Perspectives. 110 (Suppl 3): 451-488. doi:10.1289/ehp.02110s3451. ISSN 0091-6765. PMC 1241197. PMID 12060843.
[5] Loeb, L. A. and Harris, C. C. (2008). “Advances in Chemical Carcinogenesis: A Historical Review and Prospective.” Cancer Research. 68 (17): 6863-6872. doi:10.1158/0008-5472.CAN-08-2852. ISSN 0008-5472. PMC 2583449. PMID 18757397.
[6] Spartan’16, Spartan’16 for Windows, Macintosh and Linux: Tutorial and User’s Guide [pdf]. (2016). Irvine, CA: Wavefunction, Inc.
[7] Jensen, F. Introduction to Computational Chemistry. John Wiley and Sons. 1999, p 88.
Carcinogenic nature of polyaromatic hydrocarbon binding to DNA
Jacqueline DeLorenzo∗
Department of Chemistry and Biochemistry, Manhattan College
Abstract. Polyaromatic hydrocarbons (PAHs) are a class of organic contaminants produced as byproducts of the incomplete combustion of fossil fuels. These molecules are believed to intercalate into the minor groove of DNA, preventing replication and therefore causing mutations and cancer. The goal of this research project was to determine the base pair specificity of PAHs when binding to DNA, to learn more about their carcinogenic nature. DNA was adsorbed onto silica gel and exposed to a solution of phenanthrene, a representative PAH; the decrease in PAH concentration caused by DNA was determined via fluorescence spectroscopy. DNA on silica gel was found to effectively remove PAH from solution, and the absorption was found to be time-dependent, with the majority of PAH being absorbed in the first 2.5 hours of exposure. The DNA oligonucleotide strands tested were ATAT, GCGC, and the Dickerson sequence; all designer DNA strands absorbed roughly the same amount of PAH. More specific base pair sequences of DNA will need to be tested in order to determine the specificity of polyaromatic hydrocarbons when interacting with DNA.
Introduction
Despite limited public awareness, organic contaminants have become a widespread and exceedingly dangerous health concern in the United States. Organic contaminants can be toxic at incredibly low concentrations, and some are more difficult to remove from water sources than their inorganic counterparts. Water treatment facilities remove these organic contaminants from water supplies by means of coagulation, precipitation, or adsorption. Although a small amount of these contaminants may pass through purification and be ingested via drinking water, organic contaminants often enter the body via inhalation from air pollution. Prominent among these dangerous organic contaminants are polyaromatic hydrocarbons.

Polyaromatic hydrocarbons are organic molecules composed of several fused aromatic rings. These compounds, also known as PAHs, are usually colorless solids that are produced in extremely small amounts as byproducts of incomplete combustion reactions. Natural combustion of trees and brush, as well as anthropogenic sources such as coal, gasoline, and oil, produces polyaromatic hydrocarbons upon combustion (Fig. 1) [1]. These small concentrations of organic pollutants are evaporated, along with the produced carbon dioxide and water, into the atmosphere. PAHs are also produced in industry and used as pesticides, which are likewise known to possess toxicity and possible carcinogenicity.

Polyaromatic hydrocarbons are toxic and carcinogenic molecules. Because they have very low aqueous solubility and are highly lipophilic, PAHs are readily absorbed by organisms and stored in body fat. Once inside the body, PAHs often bind to cellular proteins and DNA, resulting in biochemical disruptions and cell damage [1]. Due to their hydrophobic, nonpolar structure, PAHs are often mistaken for steroid hormones within the body, making them endocrine disruptors [2]. These molecules are able to pass through the lipid bilayer of the nucleus
∗Research mentored by John Regan, Ph.D.
Figure 1. Known sources of polyaromatic hydrocarbons [1]
and interact with DNA within the cell. PAH molecules act as intercalating agents by interacting with the nonpolar nitrogenous bases of DNA in the minor groove (Fig. 2). This prevents the DNA from properly unwinding for replication, thereby causing mutations. This intracellular damage results in developmental malformations, tumors, and cancer. Phenanthrene, one example of a polyaromatic hydrocarbon, was used in this research project to create the stock solution of PAH (Fig. 2). Phenanthrene has an oral LD50 (Lethal Dose, 50%) of 750 mg/kg in mice and has yielded mutations in human lymphoblast cells at an exposure of 9 µg/mL [3].
Figure 2. Left, structure of the minor groove of DNA. Right, structure of the PAH molecule phenanthrene.
The goal of this research project was to remove PAHs from an aqueous solution by exploiting this attraction to DNA. To do this, DNA had to be adsorbed onto a solid surface via solid phase extraction. This is done by relying on the reversible interactions between DNA and a solid support, such as silica gel (SiO2 ) [4]. Interactions between DNA phosphate groups and surface silanol
groups occur via hydrogen bonds; hydrophobic interactions occur between the nitrogenous bases and the hydrophobic regions of silica gel (Fig. 3) [5]. A chaotropic salt, sodium perchlorate, was added to the solution due to the ability of chaotropic salts to drive DNA binding to silica gel. Adding electrolytes to the solution also screens the electrostatic repulsion between DNA and silica to further facilitate adsorption. Amberlite resin was tested as a possible solid support for DNA as well; however, Amberlite had an extremely high affinity for absorbing PAH on its own and therefore could not be used to measure the amount of PAH absorbed by DNA. This made it more difficult to test DNA's affinity for PAH.
Figure 3. Hydrogen bond interactions between silanol groups of silica gel and the negative phosphate groups of DNA.
To test the adsorption of DNA onto silica, the absorbance of a solution of DNA was measured before and after exposure to silica gel using a UV spectrophotometer. Once DNA was successfully bound to silica gel, the DNA-SiO2 complex was exposed to a solution with an extremely low concentration of phenanthrene. PAHs are present in nature at extremely low concentrations, measured in ppm or ppb; therefore, an extremely dilute concentration of PAH was used in the lab in order to mimic this condition. The analysis of the amount of PAH absorbed from solution was done using fluorescence spectroscopy. By measuring the fluorescence of the PAH solution before and after exposure to the DNA-SiO2 complex, the relative amount of PAH absorbed could be determined. Other factors such as the quantities of DNA and SiO2 used, concentrations, and kinetic effects could also be examined using this method.

The ultimate goal of this project was to determine the base pair specificity, if any, of polyaromatic hydrocarbons. It is known that PAHs bind to the minor groove of DNA; however, they may have a greater affinity for certain sequences of nitrogenous bases. To test this theory, isolated sequences of specific bases were adsorbed to silica and tested against a solution of PAH. If certain base pair sequences resulted in a greater absorption of PAH than others, then PAH would be shown to have specificity. This knowledge could contribute to further research on the mechanism of action of PAHs in the body as a carcinogen and their impact on human health.
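The before-and-after fluorescence comparison described above reduces to simple arithmetic on the measured intensities. A minimal sketch, assuming fluorescence intensity is proportional to phenanthrene concentration in this dilute regime (the intensity values below are illustrative, not measured data from this study):

```python
def percent_pah_removed(intensity_before, intensity_after):
    """Percent of PAH removed from solution, assuming fluorescence
    intensity scales linearly with phenanthrene concentration."""
    return 100.0 * (intensity_before - intensity_after) / intensity_before

# Illustrative intensities only (arbitrary counts)
print(percent_pah_removed(250000.0, 90000.0))  # 64.0
```

The same comparison, repeated at several exposure times, gives the kinetic curves discussed later.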
Materials and Methods

Preparation of DNA-SiO2 complex
10 mg of calf thymus DNA (45% GC character) was initially dissolved in 5 mL of water. This solution was stirred for approximately 5 minutes until the DNA was completely dissolved. 10 mL of 0.6 M sodium perchlorate, a chaotropic salt, was then added to the DNA solution. It was found that the concentration of electrolyte in solution did not have a significant impact on the experiment, but DNA adsorbed to silica best at relatively low electrolyte concentrations rather than extremely high ones. The most efficient solid support for this experiment was determined to be Sigma-Aldrich fine silicon dioxide particles (4.61 m2/g). Once the DNA solution had been stirred with sodium perchlorate for an additional 5 minutes, 10 mg of silicon dioxide was added to the solution. The silicon dioxide was mixed with the DNA solution for 45 minutes to ensure that it had become saturated with DNA. The saturated SiO2 was then centrifuged to allow for separation, and the excess solution was discarded. The DNA-SiO2 complex was washed with 0.6 M sodium perchlorate three times to remove any impurities. It was then removed from the test tube and left overnight to dry.

Preparation of the PAH stock solution
A 1.5 ppm solution of PAH was prepared to be used as a stock solution to test the absorption of PAH by DNA. The solution was prepared by weighing out 1.5 mg of phenanthrene, which was then fully dissolved in 5 mL of pure ethanol. This solution was transferred to a 1 L volumetric flask and diluted with distilled water. The solution was then left to stir for 6-7 hours to allow for total dissolution of the solid. The PAH solution was stored in dark glassware and kept in the dark, because sunlight hastens the decay of phenanthrene. A new stock solution was made every several weeks due to the natural decay of phenanthrene over time.
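The nominal 1.5 ppm concentration follows directly from the preparation: 1.5 mg of phenanthrene brought to a final volume of 1 L. A quick check of that arithmetic (for dilute aqueous solutions, 1 ppm is approximately 1 mg/L):

```python
mass_phenanthrene_mg = 1.5   # weighed-out solid
final_volume_l = 1.0         # 1 L volumetric flask

# mg of solute per liter of dilute aqueous solution ~ ppm
stock_ppm = mass_phenanthrene_mg / final_volume_l
print(stock_ppm)  # 1.5
```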
Exposure of DNA-SiO2 complex to PAH solution
The dry DNA-SiO2 complex was then ready to be exposed to the PAH solution. The 10 mg of DNA-SiO2 was placed in a beaker with 5 mL of the 1.5 ppm phenanthrene solution. This solution was stirred for varying amounts of time, ranging from 45 minutes to overnight. After the DNA-SiO2 complex had been exposed to PAH for the determined amount of time, the solution was placed in a test tube and centrifuged again to separate the DNA-SiO2 complex from the PAH solution. The solution was then pipetted into a separate test tube and analyzed using fluorescence spectroscopy.

Designer oligonucleotides
Once the procedure for the proper treatment and exposure of DNA to PAH was perfected, specific base pair sequences of DNA were used to determine the specificity of PAHs for DNA. The first repeating sequences tested were as follows: A-T-A-T, G-C-G-C, and the Dickerson sequence, C-G-C-G-A-A-T-T-C-G-C-G. These designer oligonucleotide sequences came in a single-strand
powdered form and had to first be annealed to generate a double-stranded oligonucleotide before they could be used. The annealing process is as follows: a 200 µM solution of each designer oligonucleotide was prepared. Approximately 3000 µL of solution were added to the dry DNA sequences, varied according to the exact number of nmol of each sequence. The DNA solution was composed of 4 parts distilled water to one part annealing buffer (∼2 mL H2O, ∼600 µL buffer) [6]. The annealing buffer was purchased from Sigma-Aldrich and composed of 10 mM tris(hydroxymethyl)aminomethane, 50 mM NaCl, and 1 mM EDTA (ethylenediaminetetraacetic acid) [6]. The oligonucleotide solution was then pipetted into a round-bottom flask and placed in a 95◦C hot water bath for 4 minutes to allow annealing to take place (Fig. 4). The entire DNA solution (∼3 mL) was then stirred with 10 mg SiO2, and the same procedure used for the calf thymus DNA was repeated.
Figure 4. Annealing reaction between oligonucleotides.
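The annealing-solution volumes follow from the stated 4:1 water-to-buffer split, with the total volume set by the nmol of oligonucleotide supplied and the 200 µM target concentration. A sketch of that arithmetic (the 600 nmol input is a hypothetical example, not a value from the paper):

```python
def annealing_volumes(nmol_oligo, target_um=200.0, water_parts=4, buffer_parts=1):
    """Return (water_uL, buffer_uL) for dissolving a dry oligonucleotide
    to target_um micromolar, split water_parts : buffer_parts as in the text."""
    # volume (uL) = nmol / (umol/L) scaled to microliters
    total_ul = nmol_oligo * 1000.0 / target_um
    buffer_ul = total_ul * buffer_parts / (water_parts + buffer_parts)
    return total_ul - buffer_ul, buffer_ul

# Hypothetical 600 nmol of oligo -> 3000 uL total: 2400 uL water, 600 uL buffer
print(annealing_volumes(600))  # (2400.0, 600.0)
```

For ∼600 nmol this reproduces the ∼3 mL total and ∼600 µL of buffer described above.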
Results and Discussion
The amount of phenanthrene remaining in solution before and after exposure to DNA was determined by fluorescence spectroscopy. Fluorescence spectroscopy involves passing a beam of light through an aliquot of sample in order to excite its electrons [7]. Once excited, the electrons emit light; this emission is known as fluorescence. Aromatic compounds, such as PAHs, fluoresce when their electrons become excited. We were therefore able to determine how much PAH remained in solution by measuring the fluorescence intensity. The most efficient procedure for absorbing PAH from an aqueous solution was determined by measuring the fluorescence of PAH that had been exposed to the DNA-SiO2 complex under varying conditions and comparing it to a stock solution of PAH that had not been altered.

It was determined that one of the most important influences on the absorption of polyaromatic hydrocarbons was time. Kinetic trials were run to show how the absorption of PAH changed over time. Several reactions were set up following the same procedure detailed above; however, the amount of time that the DNA-SiO2 complex was exposed to the PAH solution varied. The exposure times were as follows: 30 minutes, 1 hour, 2 hours, and overnight (∼18 hours). The absorption of phenanthrene by DNA on silicon dioxide was first compared to that of
silicon dioxide alone. As shown in Fig. 5, silicon dioxide does have the ability to absorb PAH from solution; however, silicon dioxide does not absorb as much PAH as DNA adsorbed onto silicon dioxide. The amount of PAH absorbed by SiO2 was also found to remain relatively constant over time while the absorbance of PAH by DNA was found to increase with time. The ability of silicon dioxide to absorb PAH alone is most likely due to hydrophobic interactions between polyaromatic hydrocarbons and the hydrophobic silicon regions of silicon dioxide.
Figure 5. The absorbance of PAH by DNA on silicon dioxide compared to that of silicon dioxide alone.
Figure 6. The absorbance of PAH by DNA on SiO2 at varying time intervals.
Fig. 6 shows the absorption of PAH from solution by DNA on silica gel over time. As the exposure time increases, more PAH is absorbed. The highest amount of PAH was absorbed when the solution was exposed to the DNA-SiO2 complex overnight. Fig. 7 shows a graphical representation of the percent reduction of PAH by DNA-SiO2 over time. This demonstrates a logarithmic increase in the percent reduction, with a steep increase in the amount of PAH absorbed within roughly the first 2.5 hours of exposure.

Finally, designer oligonucleotide strands of specific base pair sequences were tested to determine the base pair specificity of polyaromatic hydrocarbons. As shown in Fig. 8, all the annealed oligonucleotide strands did effectively absorb PAH from solution. However, there was not a significant difference in the amount that each oligonucleotide absorbed. The only slight difference observed was in the amount of PAH absorbed by the Dickerson sequence. This oligonucleotide strand absorbed slightly more PAH from solution; however, this may be due to the
larger size of the oligonucleotide strand.
Figure 7. Percent reduction of the amount of PAH absorbed from solution over time.
Figure 8. The absorbance of PAH by several designer oligonucleotide strands.
Conclusions
DNA was successfully loaded onto powdered silicon dioxide. This DNA-SiO2 complex effectively removed PAH from solution. The DNA on SiO2 was shown to absorb more PAH than silicon dioxide alone, and the process was found to be time-dependent. When several different oligonucleotide strands were tested for their absorption of PAH, they all absorbed PAH from solution efficiently and yielded very similar results in the amount of PAH absorbed. This indicates
that the technique for preparing and using these DNA strands was successful; however, more base pair sequences of DNA need to be tested in order to determine whether a specific sequence has a better affinity for absorbing PAH. This information on the specificity of PAH can be used for further research into the carcinogenic nature of PAH and its specific mechanism of action when interacting with DNA.
Acknowledgments This work was financially supported by a donation from Kenneth G. Mann ’63. The author would like to thank her research faculty advisor Dr. John Regan for the advisement and support.
References
[1] Abdel-Shafy, H. I. and Mansour, M. S. M. “A review on polycyclic aromatic hydrocarbons: Source, environmental impact, effect on human health and remediation.” Egyptian Journal of Petroleum, Volume 25, Issue 1, March 2016, Pages 107-123.
[2] Muñoz, B. and Albores, A. “DNA Damage Caused by Polycyclic Aromatic Hydrocarbons: Mechanisms and Markers,” in Selected Topics in DNA Repair. C. C. Chen, IntechOpen. October 26, 2011. DOI: 10.5772/22527. https://www.intechopen.com/books/selected-topics-in-dna-repair/dna-damage-caused-by-polycyclic-aromatic-hydrocarbons-mechanisms-and-markers
[3] “Phenanthrene.” U.S. National Library of Medicine, National Institutes of Health, 5 Sept. 2017. toxnet.nlm.nih.gov/cgi-bin/sis/search/a?dbs+hsdb:@term+@DOCNO+2166
[4] Vandeventer, P. E. “Multiphasic DNA Adsorption to Silica Surfaces under Varying Buffer, pH, and Ionic Strength Conditions.” The Journal of Physical Chemistry B 2012, 116 (19), 5661-5670. 26 Apr. 2012.
[5] Shi B., Shin Y. K., Hassanali A. A., and Singer S. J. “DNA Binding to the Silica Surface.” Current Neurology and Neuroscience Reports., U.S. National Library of Medicine, 27 Aug. 2015.
[6] “Protocol for Annealing Oligonucleotides.” Sigma-Aldrich, MilliporeSigma, 2018, www.sigmaaldrich.com/technical-documents/protocols/biology/annealing-oligos.html
[7] Hooijschuur, J. H. “Fluorescence Spectrometry.” Chromedia Analytical Sciences, www.chromedia.org/chro-media
Removing polycyclic aromatic hydrocarbons (PAHs) by adsorption onto silica gels treated with lipophilic carboxylic acids
Jessi Dolores∗
Department of Chemistry and Biochemistry, Manhattan College
Abstract. PAHs are a group of non-polar molecules composed of two or more fused aromatic rings. Generated primarily from the incomplete combustion of organic materials, PAHs are both mutagenic and carcinogenic, making them a public health issue and an environmental pollutant. This project studies the removal of PAH from water using silica gel (SiO2) treated with lipophilic carboxylic acids. The addition of lipophilic carboxylic acids increases the hydrophobicity of the silica gel surface for better adsorption of PAH molecules. UV/visible absorption spectroscopy was used to determine the amount of lipophilic carboxylic acid loaded on the silica gel. Using fluorescence spectroscopy, the change in fluorescence intensity of PAH during adsorption by treated silica gel was measured and converted to the mass of PAH adsorbed using a calibration curve. The PAH adsorption capacities (mg PAH/g SiO2) of the various treated silica gels were calculated, and the effect of the structures and functional groups of the lipophilic carboxylic acids on the PAH adsorption capacity is discussed.
Introduction
Polycyclic aromatic hydrocarbons (PAHs), such as phenanthrene, are a group of non-polar molecules composed of two or more fused aromatic rings. PAHs are produced by incomplete combustion reactions and daily activities, such as emissions from cars and smoke from cigarettes; they are both mutagenic and carcinogenic. The Environmental Protection Agency (EPA) has listed 16 PAHs as priority contaminants in the ecosystem. They are chemically stable and very difficult to remove from aqueous solutions. Current methods for removing PAHs involve activated carbon, biochar, and modified clay minerals [1]. The advantages of these methods are high efficiency and PAH adsorption capacity. The disadvantages are high cost and an adsorption capacity that depends on several parameters: pH, temperature, and solubility.

Due to its high surface area and low cost, silica gel (SiO2) modified with lipophilic carboxylic acids was used in this project to adsorb PAH compounds such as phenanthrene from aqueous solutions. Lipophilic carboxylic acids were used to change the silica gel surface from polar to nonpolar, giving PAH molecules a hydrophobic domain to bind to. The lipophilic carboxylic acids attach to the silica gel through hydrogen bonding between the silanol groups (Si-OH) on the silica gel surface and the carboxyl groups (-COOH) of the acids. Using UV/vis absorption spectroscopy, the amount of lipophilic carboxylic acid bonded to the surface of silica gel was determined. The treated silica gel was then soaked in an aqueous phenanthrene solution of known concentration, and the change in the fluorescence intensity of phenanthrene during the adsorption process was measured with fluorescence spectroscopy. Using the fluorescence calibration curve, the fluorescence intensity was converted to the concentration of
∗Research mentored by Jianwei Fan, Ph.D.
phenanthrene adsorbed onto the silica gel. Eleven different lipophilic carboxylic acids were used to modify the silica gel surface. The effect of the structures and functional groups of the acids on the PAH adsorption capacity is discussed.
Chemicals
All chemicals and solvents were purchased from Sigma-Aldrich Corp. (Milwaukee, WI) and used without further purification. Silica gel (60 Å) was obtained from Dynamic Adsorbents Inc. (Norcross, GA). Deionized nanopure water was used throughout. Phenanthrene was used as a model compound for PAH.
Experimental

Loading lipophilic carboxylic acids on preheated silica gel
To begin the loading of the lipophilic carboxylic acids, an MTBE-based stock solution was made for each carboxylic acid by dissolving 0.2 g of the acid in 25 mL of MTBE. Before loading, all silica gels used during the procedure were oven-heated at 150◦C for 2 hours. Once cooled, 1 g of the preheated silica gel was mixed into the carboxylic acid stock solution and left to stir overnight. The top solution was removed and the solid residue was washed with 10 mL MTBE. The resulting solid residue was left to dry for future use. Before and after the reaction with silica gel, an aliquot was taken from the mixture to determine the amount of the acid coated on the silica gel by UV spectroscopy with an Agilent 8453 UV/visible photodiode array spectrophotometer.

Measuring the adsorption of PAH by treated silica gel
Once the treated silica gel was dried and collected, a stock solution of phenanthrene was made by dissolving 1.5 mg of solid phenanthrene in 5 mL absolute ethanol. The mixture was diluted into 1 L of distilled water, stirred for 6 hours, and stored in a brown bottle. The fluorescence intensity of the stock phenanthrene was measured at 365 nm with a Photon Technology International (PTI) spectrofluorometer equipped with a 1.0 cm quartz cell. The excitation wavelength was set to 251 nm, and the emission spectra were taken in the range of 275 nm to 400 nm. In order to calculate the concentration of phenanthrene at any point during the adsorption, a calibration curve was made by using a serial dilution of the stock phenanthrene and plotting the measured fluorescence intensity vs. the concentration (ppm) of phenanthrene. 0.10 g of the treated silica gel was soaked in 50 mL of 1.5 ppm stock phenanthrene solution. Fluorescence intensity readings were taken during the adsorption process at 0, 1, 2, and 3 hours.
Using the calibration curve, the change of fluorescence intensities was converted to the change in concentration of phenanthrene by adsorption on the treated silica gel.
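The calibration curve itself is an ordinary least-squares line of intensity versus concentration built from the serial dilutions. A self-contained sketch of that fit; the data points below are illustrative, chosen to lie exactly on a line close to the fit reported in Fig. 2, and are not the actual measurements:

```python
def linear_fit(xs, ys):
    """Ordinary least-squares slope and intercept."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) \
            / sum((x - mx) ** 2 for x in xs)
    return slope, my - slope * mx

# Hypothetical serial-dilution points: (ppm, fluorescence counts)
conc = [0.0, 0.375, 0.75, 1.125, 1.5]
counts = [900.0, 69900.0, 138900.0, 207900.0, 276900.0]
slope, intercept = linear_fit(conc, counts)
print(slope, intercept)  # 184000.0 900.0 for these exactly-linear points
```

Inverting the fitted line then converts any measured intensity back to a concentration in ppm.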
Results

Spectroscopic data of lipophilic carboxylic acids used
To determine the amount of lipophilic acids loaded on the silica gel, the spectroscopic data for each lipophilic carboxylic acid were measured. Table 1 summarizes the spectroscopic data (maximum absorption wavelength and molar extinction coefficient) for the carboxylic acids that have cyclic structures in their molecules. Spectroscopic data for lipophilic carboxylic acids that did not have cyclic structures could not be determined. For these compounds, the amount of lipophilic carboxylic acid loaded onto the surface of silica gel was determined via the change of weight between the treated silica gel and the plain silica gel.

Table 1. Measured spectroscopic data for some cyclic lipophilic carboxylic acids.

  Lipophilic carboxylic acid                 λmax (nm)   εmax (M−1 cm−1)
  3,3-Diphenylpropionic acid (3,3-DPP)       204         2.59×10^4
  2,3-Diphenylpropionic acid (2,3-DPP)       203         1.28×10^4
  Diphenylacetic acid (DPA)                  203         2.45×10^4
  Phenylacetic acid (PAA)                    207         8.90×10^3
  1-Naphthaleneacetic acid (NAA)             223         5.59×10^4
  α-Phenylcyclopentylacetic acid (αPCA)      206         1.17×10^4
Loading the lipophilic carboxylic acids onto silica gel

Figure 1. UV/vis absorption spectra of 2,3-diphenylpropionic acid before (a) and after (b) soaking with silica gel (absorbance vs. wavelength, 190-240 nm).

Fig. 1 is the UV absorption spectrum of 2,3-diphenylpropionic acid before (a) and after (b) the overnight reaction with silica gel. The decrease in absorbance in the UV region is used to calculate the moles of the acid loaded onto the surface of silica gel. Using Beer's law, the difference
in concentration of the lipophilic carboxylic acid was calculated. The difference in concentration multiplied by the volume of the solution gives the moles of carboxylic acid loaded per gram of silica gel (Table 2). The percentages of the acid molecules loaded are calculated based on the initial mass of the acid used (200 mg).

Measuring PAH adsorption capacity by treated silica gels
Fig. 2 is the fluorescence calibration curve of phenanthrene (PAH) used to convert the change in fluorescence intensity into the change of PAH concentration.
Table 2. Millimoles of the acid and % of acid loaded per gram of SiO2.

  Lipophilic carboxylic acid                   mmol of acid in 1 g SiO2   % of acid loaded
  Decanoic acid (C10H20O2)                     0.399                      33.7
  Linoleic acid (C18H32O2)                     0.092                      12.9
  Stearic acid (C18H36O2)                      0.312                      44.1
  Cyclohexylacetic acid (C8H14O2)              0.384                      26.5
  Cyclopentylphenylacetic acid (C13H16O2)      0.573                      5.69
  1-Naphthaleneacetic acid (C12H10O2)          0.014                      1.27
  Phenylacetic acid (C8H8O2)                   0.109                      7.18
  Diphenylacetic acid (C14H12O2)               0.0323                     3.38
  Phenylpropionic acid (C9H10O2)               0.126                      9.28
  3,3-Diphenylpropionic acid (C15H14O2)        0.021                      2.35
  2,3-Diphenylpropionic acid (C15H14O2)        0.135                      15.2
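The loading values above come from the Beer's-law step described earlier: the drop in absorbance gives ΔC (mol/L), which multiplied by the solution volume gives the millimoles taken up per gram of gel. A hedged sketch of that calculation, ignoring any aliquot-dilution factors and using illustrative inputs rather than the actual measurements:

```python
def mmol_acid_loaded_per_g(delta_a, epsilon, path_cm, volume_l, grams_sio2):
    """mmol of carboxylic acid loaded per gram of silica gel.
    Beer's law: delta_C (mol/L) = delta_A / (epsilon * path length)."""
    delta_c = delta_a / (epsilon * path_cm)          # mol/L
    return delta_c * volume_l * 1000.0 / grams_sio2  # mmol per g SiO2

# Illustrative: dA = 0.5 in a 1 cm cell, epsilon = 1.28e4 M^-1 cm^-1
# (2,3-DPP, Table 1), 25 mL of stock solution, 1 g of silica gel
print(mmol_acid_loaded_per_g(0.5, 1.28e4, 1.0, 0.025, 1.0))  # ~1e-3 mmol/g
```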
Figure 2. Fluorescence calibration curve for phenanthrene (PAH): fluorescence intensity vs. concentration of phenanthrene (ppm); linear fit y = 184003x + 896.35, R² = 0.9925.

Figure 3. Fluorescence spectra of phenanthrene (counts vs. wavelength, 320-450 nm) before the addition of treated silica gel (orange) and at different soaking times: 1 h (blue), 2 h (black), 3 h (green).
Fig. 3 is the fluorescence spectrum of phenanthrene before the addition of treated silica gel and after the addition of treated silica gel at 1 h, 2 h, and 3 h. The mg of phenanthrene adsorbed
by the silica gel was obtained by converting the change of fluorescence intensity to a change of concentration (using Fig. 2), followed by converting the change of concentration into mg. As the soaking time increased, the fluorescence intensity of the phenanthrene went down, indicating that the treated silica gel had adsorbed some of the phenanthrene. To measure exactly how much phenanthrene was adsorbed, a capacity value was calculated. A capacity value is a measurement of how many milligrams of phenanthrene were adsorbed by one gram of the treated silica gel (mg/g) and was calculated by the expression

    mg of PAH / g of SiO2 = [∆C (in mg/L) × V (in L)] / (g of SiO2 used),

where ∆C is the change in concentration of phenanthrene and V is the volume of the solution. Table 3 lists the calculated phenanthrene adsorption capacities of the various treated silica gels. All adsorption capacity values were obtained from the average of at least three parallel measurements. After calculating the capacity values, binding ratios were calculated to see how many moles of carboxylic acid would be needed to bind one mole of phenanthrene. Table 4 gives the molar ratio of lipophilic carboxylic acid bound to phenanthrene.
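The capacity expression, combined with the calibration line from Fig. 2, can be sketched directly in code. The intensity readings below are illustrative stand-ins, not measured values:

```python
SLOPE, INTERCEPT = 184003.0, 896.35   # calibration line reported in Fig. 2

def ppm_from_intensity(counts):
    """Invert the calibration line: concentration (ppm ~ mg/L) from counts."""
    return (counts - INTERCEPT) / SLOPE

def adsorption_capacity(counts_before, counts_after, volume_l, grams_sio2):
    """mg of PAH adsorbed per gram of treated silica gel."""
    delta_c = ppm_from_intensity(counts_before) - ppm_from_intensity(counts_after)
    return delta_c * volume_l / grams_sio2   # (mg/L x L) / g

# Illustrative: 50 mL of solution, 0.10 g of treated gel
print(adsorption_capacity(276900.0, 239000.0, 0.050, 0.10))  # ~0.103 mg/g
```

For these stand-in numbers the result falls in the same range as the capacities reported in Table 3.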
Discussion

The loading of the carboxylic acids on the silica gel surface
According to the literature [2], the concentration of free silanol groups is 2.54 ± 0.08 mmol/g. None of the lipophilic carboxylic acids bonded completely to all of the free silanol groups in this experiment. One possible reason is water re-adsorbing onto the silica gel prior to the loading of the carboxylic acid. Silica gel was preheated in an oven at 150◦C in order to break the silanol-water bond prior to loading the lipophilic carboxylic acids. However, Fourier transform infrared spectroscopy (FTIR) tests on plain preheated silica gel showed peaks at 3750 cm−1 and at 3400 cm−1, indicating silanol and water bands, respectively. The water stretch band (3400 cm−1) indicated that water was present in the silica gel, occupying surface sites that the lipophilic carboxylic acid could otherwise have bonded to. This can be traced back to sample handling: samples were exposed to the atmosphere, and the solvent used to make the stock lipophilic carboxylic acid solutions was not anhydrous. A second reason for low loading could be the small pore size of the silica gel. The average pore size of the silica gel used was 60 Å (6 nm). Due to the large size of the lipophilic carboxylic acids, the acid molecules cannot penetrate some of the smaller silica gel pores. When analyzing the percentage of acid loaded by mass in Table 2, a trend appears: the bulkier the lipophilic carboxylic acid, the lower the percentage of acid loaded by mass. An example is comparing diphenylacetic acid vs. phenylacetic acid: diphenylacetic acid loaded 3.38% while phenylacetic acid loaded 7.18%. As stated earlier, some carboxylic acids used in this project could not be tested spectroscopically. Compounds such as decanoic
Table 3. PAH adsorption capacities for various treated silica gels and an untreated silica gel.

  Lipophilic carboxylic acid used to modify silica gel   PAH adsorption capacity (mg PAH/g silica gel)
  Decanoic acid (C10H20O2)                               0.104 ± 0.018
  Linoleic acid (C18H32O2)                               0.281 ± 0.030
  Stearic acid (C18H36O2)                                0.142 ± 0.006
  Cyclohexylacetic acid (C8H14O2)                        0.129 ± 0.020
  Cyclopentylphenylacetic acid (C13H16O2)                0.165 ± 0.091
  1-Naphthaleneacetic acid (C12H10O2)                    0.147 ± 0.093
  Phenylacetic acid (C8H8O2)                             0.133 ± 0.020
  Diphenylacetic acid (C14H12O2)                         0.214 ± 0.021
  Phenylpropionic acid (C9H10O2)                         0.127 ± 0.002
  3,3-Diphenylpropionic acid (C15H14O2)                  0.190 ± 0.057
  2,3-Diphenylpropionic acid (C15H14O2)                  0.283 ± 0.037
  Untreated silica gel                                   0.012
Table 4. Binding ratios for selected lipophilic carboxylic acids to phenanthrene.

Lipophilic acid used             Lipophilic acid : phenanthrene (millimoles)
Diphenylacetic acid              1.7 : 1
2,3-Diphenylpropionic acid       1.6 : 1
3,3-Diphenylpropionic acid       1.5 : 1
Cyclopentylphenylacetic acid     1.8 : 1
1-Naphthaleneacetic acid         1.6 : 1
acid, linoleic acid, stearic acid, and cyclohexylacetic acid had to be weighed before and after the addition of silica gel to their respective stock solutions. The percentage of acid loaded for these is much higher than for the rest of the compounds because the percentage takes into account the residue of carboxylic acid in the mixture.

The binding force between lipophilic acids and phenanthrene

All 11 lipophilic carboxylic acids tested bonded via hydrogen bonding between the silanol group on the silica gel and the carboxylic acid group of the lipophilic carboxylic acid, as shown in Fig. 4.
Figure 4. Hydrogen bonding between silica gel and the lipophilic carboxylic acid.
Figure 5. Binding between linoleic acid and phenanthrene.
Figure 6. Binding between 3,3-diphenylpropionic acid and phenanthrene.
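The percent-by-mass loadings discussed above can also be put on a molar basis, which makes the silanol-coverage argument concrete. The sketch below uses the loading percentages quoted in the text and standard molar masses; treating the percentage as grams of acid per gram of silica gel is an assumption about how the percentages were defined.

```python
# Convert "percent acid loaded by mass" to mmol of acid per gram of silica gel
# and compare with the reported free-silanol density of 2.54 mmol/g [2].
SILANOL_MMOL_PER_G = 2.54

def loading_mmol_per_g(pct_by_mass, molar_mass_g_mol):
    # percent by mass -> grams of acid per gram of silica -> mmol per gram
    return (pct_by_mass / 100.0) / molar_mass_g_mol * 1000.0

for name, pct, mw in [("phenylacetic acid", 9.28, 136.15),
                      ("diphenylacetic acid", 3.38, 212.25)]:
    load = loading_mmol_per_g(pct, mw)
    print(f"{name}: {load:.2f} mmol/g "
          f"({load / SILANOL_MMOL_PER_G:.0%} of free silanols)")
```

On this basis even the best-loading acid occupies well under half of the available silanol sites, consistent with the water-readsorption and pore-size arguments above.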
The results in Table 3 confirm that soaking the preheated silica gel in carboxylic acids makes a difference in the capture of PAH. The first three carboxylic acids in Table 3 are linear carboxylic acids with long chains. Although all of these acids have linear chains, they differ in their numbers of carbons and double bonds. As the number of carbons increases, the capacity of the treated silica gel to capture phenanthrene increases, owing to the increase in hydrophobicity. Comparing decanoic acid with stearic acid and linoleic acid: decanoic acid has ten carbons and the lowest capacity among the three, while linoleic acid and stearic acid each have eighteen carbons and higher capacities. Of those two, linoleic acid has the higher capacity because it is the only linear carboxylic acid tested that has double bonds present. The other lipophilic carboxylic acids tested varied in the number of cyclic structures on their hydrophobic chains. As with the linear carboxylic acids, capacities were affected by structure. The more aromatic rings present, the higher the capacity to capture PAH, due to increased π-π
interactions. When comparing phenylacetic acid vs. diphenylacetic acid, and phenylpropionic acid vs. 3,3-diphenylpropionic acid, the added aromatic ring increased the PAH adsorption capacity in both cases.

The binding ratio between lipophilic acids and phenanthrene

From Table 4, it was determined that the lipophilic carboxylic acids bind to phenanthrene in an approximately 1:1 ratio. The ratios are thought to deviate from exactly 1:1 because of the bulkiness of phenanthrene: instead of phenanthrene molecules binding onto adjacent sites, they may bind in a different arrangement.
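The binding ratios in Table 4 follow from a simple quotient of the measured millimole amounts, normalized to one millimole of phenanthrene. A minimal sketch, with illustrative input values (the actual measured millimoles are not given in this excerpt):

```python
# Lipophilic acid : phenanthrene binding ratio, normalized so that
# phenanthrene is 1. Input millimole values are illustrative only.
def binding_ratio(mmol_acid, mmol_phenanthrene):
    """Return the acid:phenanthrene ratio as a single number (x in 'x : 1')."""
    return mmol_acid / mmol_phenanthrene

# e.g. 0.34 mmol of diphenylacetic acid binding 0.20 mmol of phenanthrene
print(f"{binding_ratio(0.34, 0.20):.1f} : 1")  # 1.7 : 1, as in Table 4
```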
Conclusions

This work showed that silica gel treated with lipophilic carboxylic acids has a greater PAH adsorption capacity than untreated silica gel. In addition, the adsorption capacity depends on the structure and functional groups of the acids:
1. Increasing the number of carbon atoms in the linear lipophilic carboxylic acid increases the PAH adsorption capacity (stearic acid vs. decanoic acid);
2. Increasing the number of unsaturated bonds increases the PAH adsorption capacity (linoleic acid vs. stearic acid);
3. Increasing the number of aromatic rings increases the PAH adsorption capacity (diphenylacetic acid vs. phenylacetic acid; 3,3-diphenylpropionic acid vs. phenylpropionic acid), due to the additional π-π interaction between the lipophilic acids and PAH molecules.
Acknowledgments This work was supported by the School of Science Summer Research Scholars Program. The author would like to thank Drs. John Regan, Joseph Capitani and Alexander Santuli for their input and advice throughout the project, and Ms. Jeovanna Badson for her help in the lab during the summer.
References

[1] Smol, M. and Włodarczyk-Makuła, M., “The Effectiveness in the Removal of PAHs from Aqueous Solutions in Physical and Chemical Processes: A Review.” Polycyclic Aromatic Compounds, vol. 37, no. 4, Nov. 2016, pp. 292-313.

[2] Yoshinaga, K., Yoshida, H., Yamamoto, Y., Takakura, K., and Komatsu, M., “A Convenient Determination of Surface Hydroxyl Group on Silica Gel by Conversion of Silanol Hydrogen to Dimethylsilyl Group with Diffuse Reflectance FTIR Spectroscopy.” Journal of Colloid and Interface Science, vol. 153, no. 1, 1992, pp. 207-211.
An insoluble chemical reducing agent: Application to Cr(VI) removal

Nicholas Dushaj∗

Department of Chemistry and Biochemistry, Manhattan College

Abstract. Cr(VI) is a toxic substance with many physiological consequences, including cancer and chronic bronchitis. Its removal from drinking water, via numerous methods, has been an important endeavor for environmental engineers. Sodium borohydride (NaBH4) is a reducing agent used in organic synthesis, which can also serve as an electron donor in buffered homogeneous solutions to reduce Cr(VI) to Cr(III), a necessary dietary trace element. In this project, we utilized an insoluble form of the reducing agent, in which a borohydride ion is attached to a polystyrene resin bead by ionic interactions. This material, MP-borohydride, can be removed from the water source after the chromate reduction has occurred. Another goal is to identify an environmentally friendly buffer solution that will stabilize the pH and not add to the chemical footprint of the reaction.
Introduction

Hexavalent chromium, or Cr(VI), is a well-known health risk in drinking water. Cr(VI) is a byproduct of many industrial processes, leather tanning among them [1]. It has strong oxidizing properties, and exposure through drinking water can have dermal and oral effects on humans. It can increase levels of reactive oxygen species that cause cancer, alter gene expression, and affect respiratory function, among other physiological consequences [2]. The US Environmental Protection Agency (EPA) has set a standard of 0.8 µM of chromium ions for drinking water [3]. One method to remove Cr(VI) is through a reduction-oxidation (redox) reaction, in which a reducing agent undergoes an electron transfer and converts Cr(VI) to Cr(III). Sodium borohydride, a common reducing agent in organic chemistry, is used throughout this research. The reaction in Fig. 1 shows the reduction of Cr(VI) to Cr(III) using sodium borohydride with sodium borate as the buffer.
Figure 1. General oxidation-reduction (redox) reaction from Cr(VI) to Cr(III)

∗ Research mentored by John Regan, Ph.D.
The problem with using sodium borohydride in a homogeneous environment is the difficulty of removing the byproduct B(OH)4− from the reaction, owing to its very high affinity for water; it therefore contributes to the pollution profile of the wastewater. Borohydride ions are also corrosive and can pose a hazard comparable to Cr(VI). Therefore, it is essential to reduce Cr(VI) effectively with borohydride ions without contaminating the water. Using the borohydride reagent alone would result in rapid hydrolysis with the formation of hydroxyborohydrides, which ultimately give boric acid. The mechanism of the reduction of Cr(VI) to Cr(III) by sodium borohydride is unknown. The behavior of a borohydride ion in an acidic aqueous environment [4] is shown below (Fig. 2):
Figure 2. Overall hydrolysis mechanism of borohydride ion [5]
The large spheres represent Cr(VI) in the reactant and Cr(III) in the product. This mechanism demonstrates the hydrolysis of borohydride ions that occurs during the reduction of Cr(VI) to Cr(III). The borohydride intermediates in acidic environments can hinder the reduction of Cr(VI) because of the high redox potential of Cr(VI). The goal of this project is to investigate reaction conditions that optimize the reduction of Cr(VI) using a buffered solution that is economically and environmentally efficient. We also plan to use a water-insoluble, solid-supported form of borohydride that can be easily removed after the reduction of Cr(VI) to Cr(III).
Materials and Methods MP-Borohydride
Borohydride ions are attached to a macroporous polystyrene-supported cation-exchange resin, Amberlite®, through ionic bonds between the anionic borohydride and the cationic resin [6]. The resin is not only stable in a slightly alkaline environment, but also insoluble in water, which allows gravity filtration after the reaction. It undergoes an electron transfer that reduces the Cr(VI). The reaction of MP-borohydride and Cr(VI) is shown below:
resin–N(Et)3+ BH4−  +  Cr(VI) (chromate)  →  resin–N(Et)3+ B(OH)4−  +  Cr(OH)3 (Cr(III))
Potassium dichromate

Potassium dichromate, a common inorganic oxidizing agent in many manufacturing applications, is the source of Cr(VI). It is normally stable in acidic environments between pH 5 and 7. Cr(VI) is a strong oxidizing agent at low pH and less so in an alkaline medium. A higher pH results in a lower Cr(VI)-to-Cr(III) redox potential, so Cr(VI) reacts with MP-borohydride more easily [7]. Therefore, adjusting the pH to 8 - 10 before the addition of MP-borohydride is essential for the stability of MP-borohydride and an excellent yield of Cr(III) in this experiment.

Buffer solutions

The buffer solutions utilized in the experiment are sodium borate, sodium bicarbonate, glycine, 2-amino-2-methyl-1,3-propanediol (AMP), and 2-(cyclohexylamino)ethanesulfonic acid (CHES). Sodium borate is the reference buffer used to show that the reaction of MP-borohydride in an alkaline solution effectively reduces Cr(VI).
Experimental Procedure
The reaction of Cr(VI) with MP-borohydride (MP-BH4) and a buffer in an alkaline solution with a pH of 8 - 10 was investigated [5]. A stock solution of 200 µM potassium dichromate and 1000 µM sodium borate was prepared. Drops of 10% NaOH solution were added to a 50 mL round-bottom flask until a pH of 8 - 10 was reached. In a typical reaction, 20 mg of MP-BH4 were added to 20 mL of 200 µM potassium dichromate with a buffer solution. The reaction was stirred at 300 rpm on a magnetic stirrer for 2 hours, and a UV/Vis spectrophotometer was used to measure Cr(VI) concentrations from the peak height at 373 nm.
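The concentration readout at 373 nm follows the Beer-Lambert law, A = εlc. A sketch of the conversion, assuming a 1 cm path length and a placeholder molar absorptivity; in practice the constant comes from a calibration curve built with K2Cr2O7 standards:

```python
# Beer-Lambert estimate of Cr(VI) concentration from absorbance at 373 nm.
# EPSILON_373 is an assumed calibration value, not a figure from this paper.
EPSILON_373 = 4800.0   # L mol^-1 cm^-1 (placeholder calibration constant)
PATH_LENGTH_CM = 1.0   # standard cuvette path length

def cr_vi_conc_uM(absorbance):
    molar = absorbance / (EPSILON_373 * PATH_LENGTH_CM)  # A = e*l*c -> c (mol/L)
    return molar * 1e6                                   # mol/L -> uM

print(round(cr_vi_conc_uM(0.96), 1))  # with these constants, A = 0.96 -> 200.0
```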
Results and Discussion
When MP-BH4 was added to a solution of potassium dichromate with sodium borate buffer in a stirring environment for 2 hours, the concentration of Cr(VI) was effectively reduced as seen in Table 1.
Table 1. Cr(VI) reduction with MP-BH4 in 1000 µM sodium borate buffer

Initial Cr(VI) concentration    MP-BH4 mass    Final Cr(VI) concentration
200 µM                          20 mg          0 µM
These results (Table 1) show that Cr(VI) is completely reduced, indicating that the reaction can be accomplished in a heterogeneous environment. This reaction effectively reduced 20 mL of potassium dichromate, and UV/Vis spectroscopy shows no peak at 373 nm compared to the initial concentration of Cr(VI). This experiment successfully reduced Cr(VI) to levels below the EPA's standard for safe drinking water. However, to be more efficient, we investigated the final Cr(VI) concentrations by running reactions at different volumes of Cr(VI) and different molar ratios. Samples with volumes of 10 mL through 50 mL were also investigated and, for volumes at or below 30 mL, a significant amount of Cr(VI) was reduced in 2 hours:

Table 2. Different volumes of 200 µM Cr(VI) in 1000 µM sodium borate

Cr(VI) volume (mL)    Molar ratio MP-BH4:Cr(VI)    Final Cr(VI) concentration (µM)
10                    30                           0
20                    15                           0
30                    10                           6
40                     7                           42
50                     6                           50
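The molar ratios in Table 2 can be reproduced from the resin mass and the Cr(VI) volume. The resin's borohydride loading is not stated in the paper; the 3.0 mmol/g used below is back-calculated from the 30:1 entry for 10 mL and is therefore an assumption:

```python
# Molar ratio of resin-bound borohydride to Cr(VI) for the volumes in Table 2.
# LOADING_MMOL_PER_G is inferred from the reported ratios, not measured here.
LOADING_MMOL_PER_G = 3.0

def molar_ratio(resin_mg, cr_volume_mL, cr_conc_uM):
    umol_bh4 = resin_mg / 1000.0 * LOADING_MMOL_PER_G * 1000.0  # mg -> umol BH4-
    umol_cr = cr_volume_mL / 1000.0 * cr_conc_uM                # mL * uM -> umol Cr
    return umol_bh4 / umol_cr

for v in (10, 20, 30, 40, 50):
    print(f"{v} mL -> {molar_ratio(20, v, 200):g}:1")
```

Table 2 reports the ratios to the nearest whole number, so the 7.5:1 value for 40 mL appears there as 7.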
Table 2 indicates that reduction of Cr(VI) at much lower molar ratios is possible. Intuitively, as the volume of Cr(VI) increases, the rate of reaction decreases. The next variable that we focused on was buffer concentration. We wanted to determine whether the borohydride bead could still reduce Cr(VI) as the sodium borate concentration was lowered from 1000 µM toward zero. Table 3 summarizes the results using 20 mL of Cr(VI) and 20 mg of MP-borohydride.

Table 3. Changes in borate concentration for Cr(VI) reduction

Borate concentration (µM)    Final Cr(VI) concentration (µM)
1000                         0
100                          0
50                           6
25                           12
0                            150
The results in Table 3 show that reduction of Cr(VI) occurs even at low concentrations of sodium borate. This part of the experiment is important for identifying a reaction that still reduces Cr(VI) effectively while being economically efficient in the small-scale simulation. To
determine if different buffer solutions are also effective, a series of different structural classes were used. The results are summarized in Table 4.

Table 4. 25 µM buffer solutions and 200 µM Cr(VI) with 15 molar equivalents of MP-BH4

Entry    Buffer name                                       Final Cr(VI) concentration (µM)
1        Sodium borate                                     20 ± 12
2        Sodium bicarbonate                                 4 ± 4
3        Glycine                                           11 ± 11
4        2-Amino-2-methyl-1,3-propanediol (AMP)            22 ± 17
5        2-(Cyclohexylamino)ethanesulfonic acid (CHES)      2 ± 2
The buffer solutions in Table 4, despite their different functional groups, all provide high levels of Cr(VI) reduction (>90% reduction on average). Therefore, the buffer solutions are not considered co-reagents in the overall reaction; they merely restrict changes in pH by absorbing excess protons and hydroxide ions. Glycine and sodium bicarbonate are both environmentally friendly buffers.
Figure 3. Cr(VI) reductions in 20 mL samples after one hour of stirring
Another variable for effective Cr(VI) reduction is the stirring environment. Each experiment was conducted with different stirring speeds. 20 mL of a stock solution with 100 µM potassium dichromate, 25 µM sodium borate, and 20 mg MP-BH4 were used for each trial. The results indicate that stirring environments of 100 - 500 rpm can have an impact on the kinetics of the Cr(VI) reduction. At 500 rpm a slightly improved reduction profile is observed throughout the experiment.
Conclusions

Various Cr(VI) experiments show that MP-BH4 can reduce Cr(VI) in wastewater at low molar ratios. It is easily removed by filtration and does not contribute to water pollution. The most environmentally friendly buffers, glycine and sodium bicarbonate, are also effective at low concentrations in a dynamic stirring environment.
Acknowledgements This work was supported by the Michael J. ’58 and Aimee Rusinko Kakos Endowed Chair in Science. The author would like to express his gratitude to his mentor, John Regan, for guiding his research, and for giving him the opportunity to gain research experience.
References

[1] Jacobs, J. and Testa, S. M. 2004. Overview of Chromium(VI) in the Environment: Background and History. In: J. Guertin, C. P. Avakian and J. A. Jacobs (eds.), Chromium(VI) Handbook. CRC Press, pp. 1-21.

[2] Dayan, A. D. and A. J. Paine. 2001. Mechanisms of Chromium Toxicity, Carcinogenicity and Allergenicity: Review of the Literature from 1985 to 2000. Human and Experimental Toxicology, Vol. 20, pp. 439-451.

[3] US-EPA, U.S. Environmental Protection Agency. 2006. Basic Information About Chromium in Drinking Water. http://water.epa.gov/drink/contaminants/basicinformation/chromium.cfm

[4] Demirci, U. B. and P. Miele. 2014. Reaction Mechanisms of the Hydrolysis of Sodium Borohydride: A Discussion Focusing on Cobalt-Based Catalysts. Comptes Rendus Chimie, Elsevier Masson.

[5] Khain, V. S. 1988. Reduction of Cr(VI) to Cr(III) with Sodium Borohydride in an Alkaline Medium. Inorganic Materials, Vol. 24, Issue 3.

[6] Cook, M. M. et al. 1978. Polymer Preparation. Am. Chem. Soc. Div. Polymer Chemistry, Vol. 19, p. 1369.

[7] Xafenias, N., Y. Zhang, and C. J. Banks. 2015. Evaluating hexavalent chromium reduction and electricity production in microbial fuel cells with alkaline cathodes. Int. J. Environ. Sci. Technol. 12: 2435-2446. DOI 10.1007/s13762-014-0651-7
Core-shell nanoparticles as photocatalysts to purify water

Hannah Mabey∗

Department of Chemistry and Biochemistry, Manhattan College

Abstract. There were three main goals of this research. The first was to synthesize magnetic nanoparticles; we focused on Fe3O4 because it is magnetic. After confirming that the synthesized nanoparticles were in fact Fe3O4 and not a different species, the second goal was to coat the nanoparticles with a TiO2 shell using a method developed in this work. The third goal was to determine how the TiO2-coated nanoparticles behave when exposed to an alternating magnetic field. After taking scanning electron microscope (SEM) images of the synthesized nanoparticles, measurements were made using ImageJ software. From the SEM images and these measurements it was concluded that there was a uniform TiO2 coating on the Fe3O4 nanoparticles, and therefore that the coating method was successful. The magnetic core of the nanoparticles allows for easy removal after being used for photocatalysis. The TiO2 shell allows the Advanced Oxidation Process (AOP) to occur when exposed to UV light (365 nm). Magnetite (Fe3O4) is ferrimagnetic and was used in an induction heater with an alternating-current magnetic field. This causes localized heating around nanoparticle clusters, which can raise the temperature of the solution and induce hyperthermia in microorganisms, and thus purify water. The synthesized nanoparticles therefore purify water in two ways: through the AOP and through use in an induction heater. Because they are magnetic, the nanoparticles can be easily removed from the solution after they have been used, minimizing any toxicity concerns from nanoparticles remaining in solution.
Introduction

Polluted water is a problem that affects water sources from India to America and everywhere in between, with varying degrees of severity. According to the U.S. EPA, 1.2 trillion gallons of untreated industrial waste and sewage are dumped into U.S. waters yearly. Water pollution is caused by the disposal of untreated industrial and domestic wastewater containing pathogens and organic chemicals [1, 2]. The main way to minimize pollution concerns is to remove contaminants from wastewater before the effluents are discharged into water sources. The water purification techniques currently in use are less than ideal: they can generate harmful byproducts, be high in cost, and be only partially effective in disinfecting water from pathogens [2]. The problem of water pollution could be diminished through purification techniques that are environmentally friendly, low in cost, and efficient. An alternative technique to purify water is the Advanced Oxidation Process (AOP). The major mode of purification is the formation of reactive oxygen species (ROS) through the generation of hydroxyl radicals, which then oxidize any organic pollutants present in the water. The main benefits of using the AOP as a purification technique are that it does not generate any harmful byproducts and is low in cost [3, 4, 5]. TiO2 is used in this research to allow the AOP to occur when exposed to UV light (365 nm). TiO2 is a semiconductor, with a filled valence band and an empty conduction band separated by a small band gap. When TiO2 is subjected to UV irradiation, a photon with an energy greater than the band gap of TiO2

∗ Research mentored by Hossain Azam, Ph.D., and Alexander Santulli, Ph.D.
excites valence band electrons to the conduction band. The result is excited-state conduction band electrons and positive valence band holes, which form ROS. These ROS can react with any organic pollutants present in contaminated water, yielding purified water [5] (see Fig. 1).
Figure 1. Mechanism of ROS (reactive oxygen species) formation
Experimental

Synthesis of Fe3O4 nanoparticles

Iron(III) chloride and iron(II) chloride were mixed in a 2:1 molar ratio. This solution was stirred and heated at 50◦C for about 10 minutes, and ammonium hydroxide was added, causing black iron oxide nanoparticles to precipitate out. The iron oxide nanoparticles were collected using strong magnets, washed with water, and dried overnight at 100◦C [6]. The overall reaction is

Fe2+ + 2Fe3+ + 8OH− → Fe3O4 + 4H2O

where the formula for Fe3O4 can be written as FeO·Fe2O3, showing that iron occurs in the +2 and +3 oxidation states in a 1:2 ratio [7].

Coating the nanoparticles

500 mg of the synthesized Fe3O4 nanoparticles were dispersed, using sonication, in a mixture of 10 mL hexane and 4 mL tetrabutyl orthotitanate (TBOT). The nanoparticles were left in the TBOT and hexane mixture for about 30 minutes; then they were collected from the solution using strong magnets. The collected nanoparticles were washed with hexane 2-3 times to remove any impurities. They were then annealed at 500◦C for 30 minutes to convert the TBOT on the surface of the Fe3O4 nanoparticles to TiO2 (see Fig. 2 below).
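The 2:1 molar ratio above translates directly into masses of the two salts. A sketch for an illustrative batch size; the hydrate forms of the chlorides are assumed, since the text does not specify which salts were used:

```python
# Masses of the two iron salts for the 2:1 Fe(III):Fe(II) molar ratio.
# FeCl3*6H2O and FeCl2*4H2O are assumed hydrate forms (common lab reagents).
MW_FECL3_6H2O = 270.30  # g/mol
MW_FECL2_4H2O = 198.81  # g/mol

def salt_masses_g(mmol_fe2):
    """Return (g FeCl3*6H2O, g FeCl2*4H2O) keeping Fe(III):Fe(II) = 2:1."""
    mmol_fe3 = 2 * mmol_fe2
    return (mmol_fe3 / 1000 * MW_FECL3_6H2O,
            mmol_fe2 / 1000 * MW_FECL2_4H2O)

m_fe3, m_fe2 = salt_masses_g(10)  # a 10 mmol Fe(II) batch, for illustration
print(f"{m_fe3:.2f} g FeCl3*6H2O + {m_fe2:.2f} g FeCl2*4H2O")
```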
Figure 2. Coating method that was developed to produce TiO2 coated Fe3 O4 core-shell nanoparticles
Data and Results

Comparing the XRD pattern of the synthesized Fe3O4 nanoparticles (Fig. 3) with the literature XRD shows that Fe3O4 nanoparticles were, in fact, successfully produced. Figs. 4 and 5 show that the particles made are on the nanoscale. The SEM images also show a uniform spherical structure throughout; this is important in Fig. 5 because it can be seen that there are not two separate species present but that TiO2 is evenly distributed over the surface of the magnetite nanoparticles. From the SEM images and the measurements in Table 1, it was determined that there is a uniform TiO2 coating on the nanoparticles, because the mean length of the coated nanoparticles is larger than that of the uncoated Fe3O4 nanoparticles.
Figure 3. X-ray diffraction of synthesized magnetite nanoparticles compared to literature XRD [electrochemsci.org]
Table 1. Measurements taken of the synthesized nanoparticles using ImageJ software

Nanoparticle           Mean length (nm)    S.D. (nm)
Fe3O4                  10.991              1.871
TiO2-coated Fe3O4      16.883              2.493
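The Table 1 means also give a rough estimate of the shell thickness: the coating adds material on both sides of a particle, so the thickness is half the increase in mean diameter.

```python
# Implied TiO2 shell thickness from the mean diameters in Table 1:
# thickness = (coated diameter - core diameter) / 2
d_core_nm = 10.991    # mean Fe3O4 diameter (Table 1)
d_coated_nm = 16.883  # mean TiO2-coated diameter (Table 1)

shell_nm = (d_coated_nm - d_core_nm) / 2
print(f"shell thickness ~ {shell_nm:.2f} nm")  # ~ 2.95 nm
```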
Figure 4. Scanning electron microscope images of the synthesized Fe3 O4 nanoparticles
Figure 5. Scanning electron microscope images of TiO2 coated Fe3 O4 nanoparticles
After concluding that TiO2-coated Fe3O4 core-shell nanoparticles had been synthesized, the efficacy of the nanocatalysts was tested. E. coli was used in this research to determine whether the nanoparticles could disinfect polluted water samples when exposed to UV irradiation (365 nm). In Fig. 6 it can be seen that the nanoparticles did show good results in the disinfection of E. coli.

Figure 6. E. coli disinfection in the presence of various nanoparticles when exposed to UV light (365 nm). [Plot of E. coli concentration (number of colonies) vs. irradiation time (min) for E. coli only, E. coli + Fe3O4, E. coli + TiO2, and E. coli + TiO2-coated Fe3O4]
Figure 7. E. coli disinfection in the presence of various magnetic nanoparticles when exposed to an induction heater
The purpose of the magnetic core of the nanoparticles was to be used in an induction heater, to see whether exposure to an alternating-current magnetic field would increase the temperature of the polluted solution containing the magnetic nanoparticles and E. coli. Fig. 7 shows that the magnetic nanoparticles induced hyperthermia in E. coli cells, which resulted in the disinfection of the polluted water sample. Methylene blue, an organic dye, was used in this research as a simulated pollutant. It was found that the TiO2-coated Fe3O4 core-shell nanoparticles were able to degrade methylene
blue when irradiated by UV light (365 nm). This can be seen in Fig. 8 [plot of I/I0 vs. time (min) for the control, TiO2, Fe3O4, and TiO2-coated Fe3O4 samples].
Figure 8. Methylene blue degradation in the presence of various magnetic nanoparticles when exposed to UV light (365 nm)
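Degradation curves like Fig. 8 are commonly summarized by a pseudo-first-order rate constant, I/I0 = exp(-kt). A sketch of such a fit, using illustrative points rather than values digitized from the figure:

```python
# Pseudo-first-order fit for dye photodegradation, I/I0 = exp(-k t).
# k is the least-squares slope of ln(I0/I) vs. t, constrained through the
# origin. The data points below are illustrative, not from Fig. 8.
import math

t_min = [0, 50, 100, 150, 200]
i_ratio = [1.00, 0.93, 0.86, 0.80, 0.74]  # I/I0 at each time

k = (sum(t * math.log(1 / r) for t, r in zip(t_min, i_ratio))
     / sum(t * t for t in t_min))
print(f"k ~ {k:.5f} per minute")
```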
Conclusion

It was determined that TiO2-coated Fe3O4 nanoparticles were successfully synthesized. The catalytic efficacy of these magnetic/TiO2 nanoparticles was then tested, and it was established that they work as catalysts to purify water in two ways: through the advanced oxidation process and by inducing hyperthermia in microorganisms when exposed to an AC magnetic field. The TiO2 coating on the magnetic Fe3O4 particles allows the advanced oxidation process to occur when the nanoparticles are exposed to UV irradiation (365 nm). This produces reactive oxygen species, which then oxidize any organic contaminants present in the polluted water samples. The catalytic efficacy was also tested using an induction heater. When the magnetic/TiO2 nanoparticles were exposed to an alternating-current magnetic field, there was a temperature increase of 2-3◦C within 180 minutes. As the current of the magnetic field alternates, the magnetic moment of the nanoparticle flips with the current. Kinetic energy is proportional to temperature; therefore, as the nanoparticles move with the alternating current there is an increase in temperature around the clusters of magnetic nanoparticles. The collected data showed that when E. coli was in solution with the synthesized nanoparticles and exposed to an induction heater, the microorganisms were killed. From these data it can be concluded that the synthesized magnetic nanoparticles were able to induce hyperthermia in E. coli cells when exposed to an induction heater. Finally, it was determined that the magnetic nanoparticles can be easily removed from solution after they have been used as catalysts to purify water. The Fe3O4 core of the core-shell nanoparticles allows them to be collected and taken out of solution easily because Fe3O4 is ferrimagnetic, meaning it is attracted to magnets. Therefore, strong magnets can be used to collect
the magnetic/TiO2 nanoparticles from solution after their use. Since the nanoparticles can be taken out of solution so easily, the concern of toxicity of the nanoparticles is reduced, and this could also allow for easy reuse of the nanocatalysts.
Acknowledgements This work was financially supported by a donation from Kenneth G. Mann ’68. The author would like to thank her faculty mentors.
References [1] WHO/UNICEF, Progress on sanitation and drinking water. 90 (2015). [2] “Rivers and Streams.” Environmental Protection Agency (EPA), 13 Mar. 2013. [3] Bethi, B., S. H. Sonawane, B. A. Bhanvase, and S. P. Gumfekar. “Nanomaterials-based advanced oxidation processes for wastewater treatment: A review.” Chemical Engineering and Processing, 109, 178–189 (2016). [4] Bai Gajbhiye, S. “Photocatalytic degradation study of methylene blue solutions and its application to dye industry effluent.” International Journal of Modern Engineering Research (IJMER), 2(3), 1204-1208 (2012) [5] Andreozzi, R., V. Caprio, A. Insola, and R. Marotta. “Advanced oxidation processes (AOP) for water purification and recovery.” Catalysis Today, 53, 51–59 (1999) [6] Wei, Y., B. Han, X. Hu, Y. Lin, X. Wang, and X. Deng. “Synthesis of Fe3 O4 nanoparticles and their magnetic properties.” Procedia Engineering, 27, 632-637 (2012). [7] Marghussian, V. “Nano-Glass Composites Processing, Properties and Applications.” 181-223 (2015)
Enhancing enzymatic fuel cells with nanotechnology

Seth Serrano∗

Department of Chemistry and Biochemistry, Manhattan College

Abstract. Using renewable energy can help preserve our environment. Enzymatic fuel cells are a renewable energy source that is not widely used because it does not produce very high current densities. Nanotechnology could increase the current density of these enzymatic fuel cells by increasing the surface area of the electrode in the anode. Toward this end, we improved techniques for making clean, well-formed gold and nickel nanowires, and took steps toward making nanowire arrays. We also investigated the affinity of gold and nickel for glucose oxidase, our enzyme of choice, and for amino acids that could facilitate the current-producing reaction.
Introduction

According to a report [1] released in April of 2018 by the U.S. Energy Information Administration, an agency within the U.S. Department of Energy, energy consumption has been steadily increasing for decades and, to meet this increasing demand, so has energy production (Fig. 1A). This is not surprising. Unfortunately, neither is the fact that most of the energy being produced and
Figure 1. (A) Overview of energy in the United States of America from 1949 to 2017. (B) Sources of energy production in the United States from 1949 to 2017

∗ Research mentored by Alexander Santulli, Ph.D.
consumed is not ecofriendly. The production of natural gas and crude oil has increased most significantly to meet the higher energy demands (Fig. 1B). The report also shows a relatively steady, although unimpressive, rise in the use of renewable energy starting as early as 1955. It is our hope that an increase in renewable energy production would make less ecofriendly alternatives obsolete. Enzymatic biofuel cells are one of these ecofriendly ways of producing energy because they convert organic compounds into electrical energy. Enzymatic fuel cells work much like galvanic cells. The fundamental difference between the two is the catalyst: in traditional fuel cells, metals are used as the catalyst, whereas in enzymatic biofuel cells, enzymes are used. Enzymes are cheaper to produce than the metal alternatives are to acquire. These catalysts are responsible for the movement of electrons. The enzyme and substrate we chose to work with were glucose oxidase and glucose, respectively. Glucose oxidase catalyzes the oxidation of glucose, a process that can be seen in Fig. 2. The electrons released by this oxidation reaction travel from the anode, across a wire, to the cathode (Fig. 3). The electrons moving across the wire are what generate the current. Unfortunately, the current density produced by enzymatic fuel cells is too low to make them a viable alternative to more conventional, yet environmentally detrimental, sources of energy [2].
Figure 2. The oxidation reaction catalyzed by glucose oxidase that would be the basis for the current produced by our enzymatic fuel cell. (http://2017.igem.org/Team:ManhattanCol Bronx)
Figure 3. Enzymatic fuel cells work much like galvanic fuel cells, as can be seen in this diagram above. (https://www.pharmatutor.org/articles/advances-laccase-enzyme-industrial-biotechnology-review?page=4)
Increasing the surface area of the anode could increase the current density of an enzymatic fuel cell [3]. Our project was to determine whether the increased surface area provided by nanowires on the anode could significantly increase the current density of enzymatic fuel cells. The more surface area an anode has, the more glucose oxidase can be in contact with it, catalyzing the oxidation of glucose and releasing electrons to the anode and across the wire to the cathode.
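The surface-area argument above can be made concrete with a rough geometric sketch. All numbers here are assumptions chosen for illustration (a plausible track-etched membrane pore density and wire geometry), not values measured in this work; each wire is treated as a cylinder standing on the electrode.

```python
import math

pore_density = 3e8   # assumed pores per cm^2 of template
diameter_nm = 200.0  # assumed wire diameter
length_um = 6.0      # assumed wire length (~membrane thickness)

# Sidewall area contributed by the wires on 1 cm^2 of flat electrode.
sidewall_cm2 = pore_density * math.pi * (diameter_nm * 1e-7) * (length_um * 1e-4)
gain = 1.0 + sidewall_cm2  # total area relative to the bare 1 cm^2 electrode

print(f"approximate surface area gain: {gain:.1f}x")
```

Under these assumptions the electrode offers roughly an order of magnitude more area for glucose oxidase to contact, which is the effect the project set out to exploit.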
Methods
The U-tube method was used to make the nanowires (Fig. 4) [4]. To make nickel nanowires,
Figure 4. Three U-tubes are shown. Ni is precipitating in the membrane
a 0.075 M solution of NaBH4 was put on one side of the U-tube and a 0.05 M solution of NiCl2 was put on the other. To make gold nanowires, a 0.05 M solution of HAuCl4 was used instead of the nickel solution. Between the two solutions was a porous polycarbonate membrane. Membranes with different-sized pores were used throughout the course of the experiment to see which would yield the most useful product. The diameters of the pores varied from 15 nm to 200 nm. The solutions were poured into the U-tube simultaneously, one on each side of the membrane. As they diffused into the pores of the membrane, a precipitation reaction occurred, depositing the nickel or gold precipitate in the pores. The precipitate conformed to the shape of the pores, forming nanowires within the polycarbonate membrane. The optimal time for this reaction was determined by running the reaction for different lengths of time. Once the reaction was completed, the polycarbonate membrane was washed with methylene chloride. If there was excess precipitate on the outer surface of the membrane, it was polished off with a whetstone. The most effective way to wash away the polycarbonate was established over the course of the research. First, we put the polycarbonate membrane in a test tube and added methylene chloride until the membrane appeared to have dissolved entirely. Next, the mixture was vortexed for about 20 s and then centrifuged for 7 min at 4000 rpm to separate the nanowires from the methylene chloride and the polycarbonate dissolved in it. The supernatant was poured off and this process was repeated, starting from adding methylene
chloride, five times. It was later determined that it was best to let the dissolved polycarbonate membrane sit in the methylene chloride for at least thirty minutes during the first wash. Scanning Electron Microscopy (SEM) was performed on these nanowires at Fordham University to observe their shape and purity. The nickel and gold nanowires were tested for their affinity to glucose oxidase. Approximately 6 mg of Au was tested for its affinity for glucose oxidase in 4.75 mL of a reaction buffer; 100 µL of glucose oxidase and 150 µL of deionized water were added. This was placed on a mixer for two hours and centrifuged, and the supernatant was poured off. The remaining nanowires were washed twice with 1 mL of the buffer solution and then dried in the oven. Infrared (IR) spectroscopy was used to see if the amino acids stuck to the nanowires. A similar process was followed for Ni nanowires, except that only 3 mg of Ni nanowires were used. We also tested the affinities of the metals for certain amino acids hypothesized to bind both glucose oxidase and the metals through their metal-binding side chains: we tested L-histidine for nickel and cysteine, which carries a thiol group, for gold. First, the nanowires were stirred in a 10 mg/mL aqueous solution of their respective amino acids for at least two hours, washed three times in water, and tested via IR. Next, the concentration of the solution was increased tenfold and the process was repeated. Finally, measures were taken to increase the affinity of nickel for histidine by oxidizing the nickel nanowires with hydrogen peroxide before mixing them in the amino acid solution. For the gold, a glycine buffer was used to make the solution more basic and increase the affinity of cysteine for the gold nanowires. We also attempted to make arrays of the nanowires. Using a sputter coater, one side of a polycarbonate membrane already containing nanowires was coated in gold. The membrane was then cut into eighths.
To remove the polycarbonate and leave the array standing, we tried exposing the membrane to methylene chloride vapors (Fig. 5), heating it in a furnace, and even plasma etching it in a modified microwave oven (Fig. 6).
Figure 5. Left, set up used to contain and channel the methylene chloride vapors. Right, the polycarbonate membrane held down by pieces of microscope slides as it was exposed to methylene chloride.
Figure 6. The plasma produced in the modified microwave oven was visible, but not uniform.
Results
The optimal time to allow a U-tube to run was established as two hours for gold and four hours for nickel. Many times, when we ran U-tube reactions with nickel for less than four hours, there were too few nickel nanowires left to work with after washing. Gold, however, had a much higher yield. When gold was run for longer than two hours, the membrane often deteriorated and tore easily. The pore sizes we found most useful were 200 nm and 100 nm. When smaller pores were used, the U-tube reactions took much longer, and if they were stopped prematurely, the yield was very low. SEM images were taken to show how thoroughly the nanowires had been washed, as well as their size and shape (Figs. 7 and 8). The widths of the wires were also measured from the SEM images and the measurements were analyzed (Table 1).
Figure 7. SEM images of gold nanowires: (A) 200 nm, (B) 100 nm, (C) 50 nm, and (D) 15 nm.
Table 1. The widths of nanowires were measured in microns from the SEM images and analyzed.

Pore size    Gold nanowire widths (µm)          Nickel nanowire widths (µm)
             Mean    SD     Min    Max          Mean    SD     Min    Max
200 nm       0.125   0.050  0.051  0.236        0.280   0.042  0.181  0.375
100 nm       0.162   0.028  0.109  0.256        0.141   0.032  0.106  0.193
50 nm        0.101   0.017  0.057  0.144        0.105   0.023  0.065  0.165
15 nm        0.050   0.018  0.023  0.100        0.052   0.013  0.024  0.089
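The pore-size/width relationship can be checked directly against the means in Table 1. The snippet below simply converts the tabulated mean widths to nanometers and compares them with the nominal pore diameters:

```python
# Mean nanowire widths from Table 1, in microns, keyed by nominal pore size (nm).
mean_width_um = {
    "Au": {200: 0.125, 100: 0.162, 50: 0.101, 15: 0.050},
    "Ni": {200: 0.280, 100: 0.141, 50: 0.105, 15: 0.052},
}

for metal, widths in mean_width_um.items():
    for pore_nm in sorted(widths, reverse=True):
        width_nm = widths[pore_nm] * 1000  # convert microns to nm
        ratio = width_nm / pore_nm         # measured width vs nominal pore size
        print(f"{metal} {pore_nm:>3} nm pores -> {width_nm:5.0f} nm wires ({ratio:.2f}x)")
```

The ratios make the 200 nm anomaly easy to see: the gold wires come out well under the nominal pore size, while the nickel wires come out well over it.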
Figure 8. SEM images of nickel nanowires: (A) 200 nm, (B) 100 nm, (C) 50 nm, and (D) 15 nm.
The assays performed to determine the affinity of glucose oxidase, histidine, and cysteine for the metals could not be properly assessed with IR analysis. Initially, the lack of carbon peaks in the IR spectra suggested that the glucose oxidase did not adhere to the metals. The amino acids appeared to adhere to the metals with varied degrees of success in the two tests that we performed. However, when we tested the oxidized nickel and the glycine buffer, we realized that the carbon-chain peaks we were using to determine whether our amino acid was present were already there, thus invalidating our results. An attempt to use methylene chloride vapor to remove the polycarbonate membrane and leave an array of nanowires did not work. We used microscope slides to hold down the edges of the membrane because, when exposed to the methylene chloride vapors, the membrane would curl up on itself. The polycarbonate membrane was affected by the vapors: it appeared to melt onto the surface of the microscope slide, i.e., the nanowires were not left standing in an array but were smeared with the membrane. In the modified microwave, the plasma was not always produced, and when it did form it was not uniformly present in the bottle. The plasma was also so hot that, although it was capable of oxidizing away the polycarbonate membrane and leaving an array, the nanowires instead melted together.
Discussion
To test whether the increased surface area provided by nanowires increases the current density of an enzymatic fuel cell, making an array would be ideal. Unfortunately, we were unable to make arrays. However, a possible next step would be to put a drop of the washed nanowires on an anode, let the liquid evaporate, and test whether the increased surface area produced by the nanowires, although not optimal, does increase the current density of the fuel cell. In addition, since the plasma technique appeared to be the most promising, further modifying the microwave so that the bottle is under continuous vacuum while the plasma forms is another plausible next step. The average widths of the nanowires were interesting. Although the widths did not exactly equal the pore sizes, for both gold and nickel there did appear to be a relationship between pore size and nanowire width, which was to be expected, except for the 200 nm samples: there, the average width of the gold nanowires was much smaller than that of the nickel nanowires. While the sizes of the nanowires were not a primary component of our present investigation, if nanowires can indeed be used to increase the current density of enzymatic fuel cells, it will be worth considering the ideal nanowire size for achieving the maximum increase in current density. We successfully made nanowires and designed experiments to test the affinity of these nickel and gold nanowires for glucose oxidase and amino acids. Deciding on a more effective way to assess the results of these experiments, and performing the tests again, is another way to advance this project. Enzymatic fuel cells can be used in bioremediation efforts, producing energy from wastewater. They can also be used in pacemakers, so that the pacemakers run off the sugars produced in the host's body.
Continued studies in this field could lead to great ecological and medicinal advances.
Acknowledgments This work was supported by the Linda and Dennis Fenton '73 endowed fund for biology research. The author would like to thank his advisor, Dr. Alexander Santulli, whose ingenuity was one of the project's greatest assets. A special thanks goes to Dr. Bryan Wilkins and his team for their insights on the glucose oxidase enzyme.
References [1] Primary energy overview; Primary energy production by source. U.S. Energy Information Administration (2018). https://www.eia.gov/totalenergy/data/monthly [2] Kumar, R., Singh, L., Zularisam, A. W., and Hai, F. I. Microbial Fuel Cell Is Emerging as a Versatile Technology: A Review on Its Possible Applications, Challenges and Strategies to Improve the Performances. Int. J. Energy Res. 2018, 42 (2), 369-394.
[3] Sakimoto, K. K., Liu, C., Lim, J., and Yang, P. Salt-Induced Self-Assembly of Bacteria on Nanowire Arrays. Nano Lett. 2014, 14 (9), 5471-5476. [4] Koenigsmann, C., Santulli, A. C., Sutter, E., and Wong, S. S. Ambient Surfactantless Synthesis, Growth Mechanism, and Size-Dependent Electrocatalytic Behavior of High-Quality, Single Crystalline Palladium Nanowires. ACS Nano 2011, 5 (9), 7471-7487.
Solar cells using nanowire technology
Francisca Villar∗
Manhattan College
Abstract. Keeping up with the world's growing demand for energy is a persistent challenge. The United States is among the world's largest energy consumers, so scientists and engineers have been looking for new innovations and sources to help meet our energy needs. They have been turning to renewable energy, specifically solar energy, to cut down on carbon emissions while maintaining energy production within the United States. However, solar panels are expensive in comparison to fossil fuels and are only about 18% efficient. Therefore, to make renewable energies like solar energy more competitive with fossil fuels, researchers are developing ways to minimize their cost and increase their efficiency of energy production. In recent years, perovskite solar cells have improved to reach efficiencies of over 20%, which is competitive with silicon solar cells. This is a big step toward reducing the cost of solar cells because perovskite solar cells are less expensive than their silicon counterparts. These perovskite solar cells are commonly produced using lead halides and an organic amine. In our work, we show that we can successfully synthesize nanowires of methylammonium lead iodide for use in solar cells.
Introduction
One of the biggest issues the world currently faces is keeping up with the demand for energy as we become more dependent upon technology. Traditionally, this energy is produced through the consumption of fossil fuels such as natural gas and coal [1]. However, the need to minimize carbon emissions has pushed alternative, clean energy sources to the forefront. Scientists and engineers have been developing and studying ways to minimize the amount of carbon dioxide, sulfur dioxide, and several other harmful gases that end up in our atmosphere. One sustainable and green source of energy is solar panels. Studies show that traditional solar panels are able to convert approximately 20.1% [2] of the energy from the sun into useful electricity [3]. Although fossil fuels have major environmental setbacks, solar panels have a few setbacks of their own. The efficiency of solar panels is weather dependent, which is not ideal since the weather is constantly changing and energy is constantly needed [4]. Additionally, solar panels cost many thousands of dollars, which helps explain why the market for them has not been more successful. Due to these major issues, scientists have been researching new materials for solar panels in hopes of finding a more efficient and cost-effective solution. The research we conducted focused on producing methylammonium lead iodide (MALI) nanowires. Previous work has used MALI in solar cells, but mostly in the form of thin films. Our research utilized a template-based method to grow nanowires in polycarbonate membranes. We found that the concentration of reactants, as well as the reaction time, had the greatest effect on the growth of the nanowires. After producing the samples, they were analyzed through Scanning Electron Microscopy (SEM) and X-ray powder diffraction (XRD) to determine their shape and purity.
Research mentored by Alexander Santulli, Ph.D.
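As a quick illustration of what the efficiency figures above mean, the sketch below converts module efficiency into electrical power per square meter under the standard 1000 W/m² test irradiance (the irradiance value and the one-square-meter panel are conventional illustration choices, not data from this work):

```python
INSOLATION = 1000.0  # W/m^2, standard test-condition irradiance

def panel_power(efficiency, area_m2):
    """Electrical output of a panel with the given efficiency and area."""
    return efficiency * INSOLATION * area_m2

# The ~18% and 20.1% figures cited above, for a 1 m^2 panel:
for eff in (0.18, 0.201):
    print(f"{eff:.1%} efficient -> {panel_power(eff, 1.0):.0f} W per m^2")
```

The roughly two-percentage-point gap between the figures translates directly into about 20 W more per square meter of panel, which is why even small efficiency gains matter commercially.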
Materials and Methods
Potassium iodide and lead nitrate
To begin forming the methylammonium lead iodide wires, different concentrations of potassium iodide (0.1 M, 0.15 M, and 0.2 M) were tested against a lead nitrate concentration held constant at 0.05 M throughout the experiment. Potassium iodide is a white neutral salt that, when combined with acidic lead nitrate, precipitates to form a neon yellow solid of lead (II) iodide. Lead (II) iodide, an affordable semiconductor, was used in this research, since our purpose was to create efficient, low-cost solar cells.
U-tube method
The U-tube method was used to confine the precipitation reaction occurring between the solutions of lead nitrate and potassium iodide. Between the two half cells of the U-tube, a thin polycarbonate membrane was secured. These polycarbonate membranes have nanometer-sized pores, which force the lead iodide to grow outward in the shape of wires, since expansion within the pores is limited. Membranes with pore sizes of 50 nm, 100 nm, and 200 nm were tested in an attempt to produce high-quality lead iodide wires. The reaction was allowed to proceed for several hours before the solutions were removed and the reaction stopped.
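The precipitation above follows Pb(NO3)2 + 2 KI → PbI2 + 2 KNO3, so two iodide ions are consumed per lead ion. A short limiting-reagent sketch shows why the lowest KI concentration tested (0.1 M) exactly balances 0.05 M lead nitrate at equal volumes; the 10 mL half-cell volume is an assumption for illustration, not a value from this work.

```python
MW_PbI2 = 461.0   # g/mol, lead (II) iodide
volume_L = 0.010  # assumed volume per half-cell

mol_Pb = 0.05 * volume_L  # lead nitrate, held at 0.05 M
mol_KI = 0.10 * volume_L  # lowest potassium iodide concentration tested

# PbI2 needs 2 iodides per lead, so the limiting quantity is:
mol_PbI2 = min(mol_Pb, mol_KI / 2)
print(f"theoretical PbI2 yield: {mol_PbI2 * MW_PbI2 * 1000:.1f} mg")
```

The higher KI concentrations tested (0.15 M and 0.2 M) simply supply iodide in excess, leaving lead the limiting reagent.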
Figure 1. U-tube setup precipitating lead iodide in a polycarbonate membrane
Methylammonium lead iodide nanowires
Following the growth of the lead (II) iodide nanowires, a significant amount of solid remains on the surface of the polycarbonate membrane. It is important to remove this material from the surface because it has not been confined to the pores of the template and is not nanosized. This material
was removed by rubbing the template while immersed in a small amount of isopropyl alcohol in a watch glass. Subsequently, the polycarbonate membrane containing the lead (II) iodide nanowires was placed in solutions of methylammonium iodide in isopropyl alcohol at different concentrations for different periods of time (Fig. 2). Concentrations of 0.15 M, 0.20 M, 0.25 M, and a saturated solution (approximately 0.4 M) were each tested at every time period. The times tested for the conversion of lead iodide to MALI were 2, 4, 24, and 48 hours. For X-ray diffraction, each sample was loaded onto the sample holder under a flow of nitrogen to avoid moisture and tested for purity. Similarly, Scanning Electron Microscopy (SEM) was used to visualize the wires to see if the reaction was truly confined to the pores of the polycarbonate template.
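The conversion study described above crosses each methylammonium iodide concentration with each soak time. Enumerating that grid (a trivial sketch of the experimental design, with the saturated solution represented as 0.4 M) shows the 16 conditions tested:

```python
from itertools import product

concentrations_M = (0.15, 0.20, 0.25, 0.40)  # 0.40 M ~ saturated solution
times_h = (2, 4, 24, 48)

# Every concentration was tested along with every time period.
conditions = list(product(concentrations_M, times_h))
print(f"{len(conditions)} concentration/time conditions")
```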
Figure 2. Lead iodide in a polycarbonate membrane being converted to methylammonium lead iodide in an isopropyl alcohol solution
Figure 3. Solar cell being spin coated with dimethylformamide (DMF) solution
Perovskite solar cell
Solar cells have several layers: a fluorine-doped tin oxide transparent conductive oxide layer, a TiO2 layer, the MALI layer, a layer deposited from a chlorobenzene solution containing spiro-MeOTAD, tert-butylpyridine, and a lithium salt, and a gold (Au) layer, all on a substrate. The TiO2 and spiro-MeOTAD layers act to prevent the device from short circuiting, while the MALI layer acts as the light absorber. To determine the optimum amount of solution needed to produce a uniform film, various amounts of both the chlorobenzene and the MALI solutions were tested. The best results were found when 15 to 20 drops were deposited and then spin coated. To determine the optimum speed, the spin coater was also tested at different speed settings to achieve an opaque yet thin film on the substrate. The substrate was spun at approximately 1,000 - 2,000 rpm and tested at 8, 9, and 10 seconds.
Results
Methylammonium lead iodide nanowires
To ensure that methylammonium lead iodide wires were being successfully formed, both X-ray diffraction and SEM were conducted on every sample to monitor progress. X-ray diffraction provides a unique pattern for every compound or element based on the crystal structure of the substance. To determine the purity of the MALI nanowires, we compared the X-ray diffraction patterns of the MALI to the pattern from lead (II) iodide nanowires. At short conversion times, peaks corresponding to pure lead (II) iodide are still seen in the diffraction pattern (Fig. 4). This shows that the conversion time was not long enough, so a longer conversion time was used. Results in Fig. 5 show that after 24 hours in a 0.25 M solution of methylammonium iodide, the conversion is very nearly complete. Longer soaking in the methylammonium iodide solution also minimized the chance of conversion back to lead iodide from random errors such as moisture. The small impurity of lead (II) iodide present even at long conversion times can be a result of sample preparation. When exposed to moisture, the MALI decomposes into lead (II) iodide. The samples can easily collect moisture from the environment during evaporative drying. To minimize this effect, the samples are dried under a stream of dry nitrogen. The other concentrations of methylammonium iodide tested, 0.15 M and 0.20 M, did not show complete conversion, even at very long times.
Figure 4. X-ray diffraction of lead (II) iodide nanowires (black) and MALI nanowires after 1 hour of conversion (red)
Figure 5. X-ray diffraction of lead (II) iodide nanowires (black) and MALI nanowires after 24 hours of conversion (red)
Perovskite solar cells
Another essential step, aside from creating the nanowires, was creating the solar cells. In the preparation of the solar cells, the amounts of both the MALI in dimethylformamide and the chlorobenzene solution containing spiro-MeOTAD that produced the most opaque and uniform film on the substrate were 15 drops of each. Similarly, the speed tests showed that spinning the substrate on the spin coater at 2000 rpm for 20 seconds is optimal. After every solar cell substrate was prepared, it was tested for its voltage and current using a voltmeter. Unfortunately, the solar cells produced
displayed a voltage but no current, which means the devices were not completing a working circuit. Work is ongoing to produce functioning solar cells from the nanowires produced in this research.
Discussion
Several simple techniques and easily accessible reagents were used in this research, allowing straightforward preparation and synthesis of the nanowires. Successful 200 nm methylammonium lead iodide nanowires were produced by manipulating the concentrations of the solutions and the pore sizes of the polycarbonate membranes. X-ray diffraction was run on every sample to determine its purity. As shown in Fig. 5, very nearly pure MALI samples can be produced by using conversion times of 24 hours. Once it was determined that the conversion from lead (II) iodide to MALI was complete, SEM provided reassurance of the quality of the wires. As can be seen in Fig. 6, while the sample is MALI, the morphology is not ideal. There are a large number of wires present, but their lengths were shorter than those in the lead (II) iodide sample, approximately 2 microns as compared to 4.5 microns for the lead (II) iodide. The shortening of the wires indicates that the wires break during the transformation from lead (II) iodide to MALI. Since this conversion occurs in solution, we suspect this is caused by some dissolution of the wires into the isopropyl alcohol solution of methylammonium iodide. Further studies are required to confirm this and to minimize the dissolution.
Figure 6. (A) SEM image of 200 nm lead (II) iodide nanowires. (B) SEM image of 200 nm MALI nanowires after 24 hour conversion.
It was also found that the ideal concentration of potassium iodide differed for different pore sizes: as the pore size decreased, the required potassium iodide concentration increased. Membranes with 200 nm pores required 0.1 M potassium iodide and 0.05 M lead nitrate to produce quality nanowires. This technique is therefore a promising route to nanowires that could allow solar cells to operate more efficiently than traditional polycrystalline silicon solar cells. The solar cells produced in this research did not conduct a current, so a complete electricity-generating circuit was not achieved. In future research, 200 nm MALI nanowires will
be placed into newly fabricated solar cells to ensure that the wires and the cells together successfully generate electricity.
Acknowledgments This work was supported by the Michael J. ’58 and Aimee Rusinko Kakos Chair in Science Endowment. The author would like to thank Dr. Koenigsmann from Fordham University for the use of their SEM.
References [1] Clean Energy Institute. (n.d.). Perovskite Solar Cell. Retrieved October 14, 2018, from https://www.cei.washington.edu/education/science-of-solar/perovskite-solar-cell/ [2] Nagabhushana, G. P., Shivaramaiah, R., and Navrotsky, A. (2016). Direct calorimetric verification of thermodynamic instability of lead halide hybrid perovskites. Proceedings of the National Academy of Sciences, 113(28), 7717–7721. https://doi.org/10.1073/pnas.1607850113 [3] Boix, P. P., Agarwala, S., Koh, T. M., Mathews, N., and Mhaisalkar, S. G. (2015). Perovskite Solar Cells: Beyond methylammonium lead iodide. The Journal of Physical Chemistry Letters, 6(5), 898–907. https://doi.org/10.1021/jz502547f [4] Sendy, A. (2017). What does solar module efficiency mean? - Solar Estimate News. Retrieved October 14, 2018, from https://www.solar-estimate.org/news/2017-11-22-what-solar-module-efficiency-mean-why-it-matters
Creating lead-free perovskite materials for a cleaner, greener future
Amanda Zimnoch∗
Department of Chemistry and Biochemistry, Manhattan College
Abstract. With mounting concerns and dangers surrounding climate change in recent years, it has become apparent that a switch to green energy sources must occur. While green energy sources, such as solar energy, do exist, they are often more difficult and expensive to fabricate and use than non-green energy. Concerning solar energy, strides have been made in synthesizing new materials for solar cells that are cheaper and easier to manufacture. Most promising are perovskite materials, specifically methylammonium lead iodide (MALI). While MALI shows great promise, there are still toxicity concerns surrounding it, since it contains lead. Therefore, the aim of the current research was to synthesize novel lead-free perovskite materials for use in solar cells and to test their efficiencies as such.
Introduction
Perovskite materials have the general formula ABX3, where A and B are cations and X is an anion. They are arranged as if eight BX6 octahedra, sitting on the corners of a cube, surround one central A atom (see Fig. 1). Organic-inorganic hybrid perovskite materials can be used to convert solar energy into electrical energy. Over the past few years the improvement in the performance of perovskite solar cells has been rapid, rising from 9% to over 20% [1]. These materials have high optical absorption properties combined with balanced charge transport properties and long carrier diffusion lengths, making them excellent candidates for photovoltaic devices.
Figure 1. The general structure of a perovskite material in which A and B are cations and X are anions.
Perovskite materials show a lot of promise for becoming the new standard in solar cells. First generation solar cells used silicon crystals, which are expensive but effective. Second generation
Research mentored by Alexander Santulli, Ph.D.
solar cells used amorphous silicon, which is thinner and cheaper, but less effective. Third generation solar cells include organic, dye-sensitized, polymer, copper zinc tin sulfide, nanocrystal, micromorph, quantum dot, and perovskite solar devices [2]. Perovskite materials fall into the third generation of solar cells. Compared to traditional silicon solar cells, perovskite cells are simpler and cheaper to manufacture. Silicon solar cells demand costly, multistep processes performed at high temperatures (over 1000 °C) in a highly evacuated chamber, whereas the organic-inorganic perovskite substances are fabricated by simple wet chemistry methods in a non-evacuated ambient environment [2]. Perovskite materials also exhibit high optical absorptivity, meaning they can be made thinner (500 nm) than the typical silicon solar cell (2 µm) [2]. The material that has been the major focus of perovskite research is methylammonium lead iodide (MALI). In this material, the two cations are lead (II) and the organic ion methylammonium. While this compound has shown efficiencies exceeding 20%, several issues must be addressed to improve these materials [1], e.g. the requirement of lead in the material and its degradation in moist environments [3]. Our research project focused on removing the lead (II) from the perovskite material to mitigate these effects. Several reports have shown that a good number of variations can be made to the perovskite material to fine-tune its properties [4]. We especially wanted to focus on replacing lead with cobalt because previous studies had found that, among all transition metals, cobalt showed the most promise and highest efficiency [4].
Materials and Methods
Cesium copper iodide
Method 1: One part 0.1 M cesium acetate in H2O and one part 1 M copper acetate were mixed. The solution was then added to two parts 1 M potassium iodide.
Method 2: Solid copper iodide was heated in boiling 0.1 M cesium acetate in H2O.
Method 3: A 100 nm template of copper iodide nanowires was prepared in a U-tube using 0.5 M potassium iodide on one side and 0.5 M copper acetate on the other side. The template was then submerged in 1 M cesium acetate in H2O and heated.
Method 4: Solid copper iodide and solid cesium acetate were ground together using a mortar and pestle and the product was heated.
Cesium lead iodide
A 100 nm template of lead iodide nanowires was prepared in a U-tube using 1 M lead acetate on one side and 1 M potassium iodide on the other side. The template was then submerged in 1 M cesium acetate in isopropyl alcohol.
Cobalt lead iodide
One part 0.1 M lead nitrate and one part 1 M cobalt nitrate were mixed. The solution was then added to two parts 0.1 M potassium iodide.
Copper lead iodide
One part 1 M copper acetate was mixed with a few drops of 0.1 M lead nitrate. The solution was then added to one part 0.1 M potassium iodide.
Methylammonium copper iodide
Solid copper iodide was heated in boiling methylammonium iodide.
Methylammonium lead iodide
A 100 nm template of lead iodide nanowires was prepared in a U-tube using 0.05 M lead nitrate in H2O (acidified with a few drops of concentrated acetic acid) on one side and 0.1 M potassium iodide in H2O on the other side. The template was then submerged in 0.1 M methylammonium iodide and dissolved in methylene chloride to give methylammonium lead iodide nanowires.
Methylammonium lead-cobalt iodide
One part 1 M cobalt acetate was mixed with a few drops of 0.1 M lead nitrate. The solution was then added to one part 0.1 M potassium iodide.
Methylammonium nickel iodide
0.88 M nickel iodide and 0.88 M methylammonium iodide were prepared in the same 5 mL solution of DMF (dimethylformamide). Solids were filtered out and excess DMF was evaporated off the product.
Methylammonium cesium-lead iodide
A 100 nm template of methylammonium lead iodide nanowires was prepared. The template was submerged in 0.1 M cesium acetate in isopropyl alcohol before being dissolved in methylene chloride.
Methylammonium cobalt iodide
0.44 M cobalt iodide and 0.88 M methylammonium iodide were prepared in the same 5 mL solution of DMF. Solids were filtered out and excess DMF was evaporated off the product at either 150 °C or 250 °C.
Nickel lead iodide
One part 1 M nickel nitrate was mixed with a few drops of 0.1 M lead nitrate. The solution was then added to one part 0.1 M potassium iodide.
Results and Discussion
To characterize the materials synthesized, we measured their optical, electronic, and structural properties. To explore the structural component of the material, we used a D2 Phaser powder X-ray diffractometer. X-ray diffraction is a common technique used to study crystal structures and atomic spacing. It is based on constructive interference of X-rays scattered by a crystalline sample [5]. These X-rays are generated by a cathode ray tube and directed toward the sample. The interaction of the incident rays with the sample produces constructive interference (and a diffracted ray) when conditions satisfy Bragg's Law, nλ = 2d sin θ [5]. This law relates the wavelength of electromagnetic radiation to the diffraction angle and the lattice spacing in a crystalline sample. These diffracted X-rays are then detected, processed, and counted. By scanning the sample through a range of 2θ angles, all possible diffraction directions of the lattice should be attained due to the random orientation of the powdered material [5]. Conversion of the diffraction peaks to d-spacings allows identification of the mineral because each mineral has a set of unique d-spacings. This is typically achieved by comparing the d-spacings with standard reference patterns [5]. The results of the XRD scans can be seen in Figs. 2-9.
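Bragg's law can be applied directly to peak positions read off the scans to recover d-spacings. The sketch below assumes Cu Kα radiation (λ = 1.5406 Å), the usual source on benchtop diffractometers such as the D2 Phaser, though the source is not stated in the text; the peak list is the set of angles noted for Figs. 2 and 3.

```python
import math

WAVELENGTH_A = 1.5406  # angstroms, Cu K-alpha (assumed source)

def d_spacing(two_theta_deg, n=1):
    """Solve n*lambda = 2*d*sin(theta) for d, given a peak position in 2-theta."""
    theta = math.radians(two_theta_deg / 2.0)
    return n * WAVELENGTH_A / (2.0 * math.sin(theta))

# Approximate peak positions (degrees 2-theta) shared by Figs. 2 and 3.
for peak in (25, 30, 42, 50):
    print(f"2-theta = {peak:2d} deg -> d = {d_spacing(peak):.2f} A")
```

Matching the resulting d-spacings against standard reference patterns is exactly the identification step described above.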
Figure 2. Cesium copper iodide XRD
Figure 3. Cesium copper iodide from H2 O XRD
Figure 4. Cesium copper iodide from isopropyl alcohol XRD
Figure 5. Methylammonium lead iodide made from acidified lead iodide XRD
Figure 6. Methylammonium cobalt iodide XRD, first attempt
Figure 7. Methylammonium cobalt iodide XRD, second attempt
Figure 8. Methylammonium cobalt iodide XRD, third attempt, using both high and low heat
Figure 9. Methylammonium cobalt iodide XRD, fourth attempt, using both high and low heat
Our main reason for using X-ray diffraction was to determine whether similar procedures produced the same end products. For example, Figs. 2, 3, and 4 all show scans of cesium copper iodide prepared using three different methods. Figs. 2 and 3 both have diffraction peaks at about 25, 30, 42, and 50 degrees, meaning that they are most likely the same compound. Fig. 4, however, has many peaks that do not match those in Figs. 2 and 3. Therefore, we can conclude that the product in the third scan is a different compound from those in the first two scans. These three scans on their own do not tell us the identity of what we synthesized. To determine whether we have in fact synthesized cesium copper iodide, these scans will have to be compared to scans of cesium copper iodide from the literature. Some samples were also analyzed using a Scanning Electron Microscope (SEM). An SEM image of synthesized copper iodide nanowires can be seen in Fig. 10, showing that we successfully synthesized copper iodide in nanowire form. The rest of our synthesized products should also be imaged using SEM to determine their morphology.
Figure 10. Scanning electron microscope image of synthesized copper iodide nanowires
In conclusion, syntheses of many different perovskite materials were attempted. We performed X-ray powder diffraction on all the samples, but additional analytical techniques, such as scanning electron microscopy and UV/Vis spectroscopy, will be applied in the future to further identify the products. The syntheses will be repeated to test their reproducibility. When a product is found to be reproducible and to have high absorption, we will fabricate thin-film solar cells via a spin-coating technique, using the product as the light-absorbing material. We will then measure the solar cell performance using a solar simulator at Fordham University in collaboration with Dr. Christopher Koenigsmann.
Acknowledgments This work was supported by the School of Science Summer Research Scholars Program.
References
[1] Manser, J. S., Saidaminov, M. I., Christians, J. A., Bakr, O. M., and Kamat, P. V. Making and Breaking of Lead Halide Perovskites. Acc. Chem. Res. 49 (2), 330–338 (2016)
[2] Ansari, M. I. H., Qurashi, A., and Nazeeruddin, M. K. Frontiers, opportunities, and challenges in perovskite solar cells: A critical review. Journal of Photochemistry and Photobiology C: Photochemistry Reviews 35, 1–24 (2018)
[3] Hailegnaw, B., Kirmayer, S., Edri, E., Hodes, G., and Cahen, D. Rain on Methylammonium Lead Iodide Based Perovskites: Possible Environmental Effects of Perovskite Solar Cells. J. Phys. Chem. Lett. 6 (9), 1543–1547 (2015)
[4] Boix, P. P., Agarwala, S., Koh, T. M., Mathews, N., and Mhaisalkar, S. G. Perovskite Solar Cells: Beyond Methylammonium Lead Iodide. J. Phys. Chem. Lett. 6 (5), 898–907 (2015)
[5] Dutrow, B. L. and Clark, C. M. X-ray Powder Diffraction (XRD). Retrieved from https://serc.carleton.edu/research_education/geochemsheets/techniques/XRD.html (2018)
Parallel GPU based simulation of a multi-layer neural network with multi-valued neurons
James S. Abreu Mieses∗
Department of Computer Science, Manhattan College
Abstract. Modern computer CPU (Central Processing Unit) processing speed and memory limitations greatly reduce the learning capabilities of artificial neural networks (ANN) and their applications, restricting the complexity and size of the problems that can be solved. In this paper, we consider the parallel implementation of a multi-layer neural network with multi-valued neurons (MLMVN) on multiple NVIDIA CUDA (Compute Unified Device Architecture) enabled GPUs (graphical processing units), with the goal of speeding up its learning process.
Keywords: CPU, GPU, CUDA, MLMVN, ANN
Introduction
In recent years, technological advances have opened a world of opportunities to the scientific community, allowing for large-scale computational experiments with big data and the search for solutions to increasingly complex problems that require large amounts of data [1] to process and interpret. Modern CPU devices have become inadequate for large-scale computations, lacking the capability demanded today. The advent of NVIDIA's CUDA-enabled GPU devices made it possible to quickly create and implement large-scale algorithms and obtain results, greatly increasing experiment throughput. This is a result of the flexibility and low learning curve of the CUDA API (Application Programming Interface) [2], developed and optimized for large-scale applications. Some of the many examples of large-scale GPU-based applications, which increased experimentation speed and decreased the required iterations, include neural network applications in data mining [3], cybersecurity research, as detailed in [4], and blood analysis [5]. All involve tasks needing solutions in modern society that can only be obtained through years of research and exceedingly large amounts of data requiring large computation capabilities during processing. For the experiment at hand, we consider the implementation of a complex-valued neural network (MLMVN), a popular tool in machine learning, which is used to process and interpret large quantities of data given as a set [6].
After completion of the learning process, a neural network is capable of returning an approximation of the wanted output while allowing for a margin of error that can be adjusted later in the learning process, thus simulating a trial-and-error approach to learning. Although the machine learning family provides a variety of tools for the same task, namely the processing of large data sets, MLMVN offers several advantages over the other tools, which can be seen by comparing MLMVN with its real-valued counterpart [3]. For instance, MLMVN offers greater flexibility and better generalization capability than real-valued neural networks, as illustrated in detail in [6]. In this paper, we compare a serial version of the MLMVN application optimized for CPU utilization with an MLMVN application designed for GPU usage using the MATLAB Parallel Computing Toolbox, which offers an array of procedures and variables that facilitate programming on CUDA-enabled GPUs. Both applications are tested using two learning sets and an increasing number of neurons in the two hidden layers we employed.
∗ Research mentored by Igor Aizenberg, Ph.D.
Related Works
Neural networks have become the de facto tool for large data processing applications [1], and their implementation on GPUs has dramatically increased the speed at which results are gathered. One example of their use can be seen in [4], where neural networks designed and optimized for CUDA-enabled GPUs are given large collections of known virus and malware registries, enabling future threats to be quickly identified and tracked by analyzing their patterns of attack and thereby enhancing the security of the increasingly digitized world of today. Another great example is detailed in [2], where a deep neural network application is implemented on multiple GPUs to process and analyze large data sets generated from large-scale transient systems, i.e., power generators simulated over a wide area, increasing processing speed up to 345 times in comparison to the CPU simulator.
Environment and MLMVN
Before discussing the implementation of the MLMVN, we describe the environment in which the experiment was conducted and explain how the complex-valued neural network performs its learning process. All simulations in this work were run on Dionysus, a supercomputer of the Manhattan College School of Science, containing twenty-four Intel Xeon 12-core CPUs with a memory speed of 2.60 GHz (Fig. 1). The machine also contains 4 Tesla K20c GPUs, each with a clock speed of 2.6 GHz and a processing speed of 705 MHz across 2496 cores (Fig. 2).
Figure 1. The architecture of an Intel Xeon 7500 CPU Device (https://images.anandtech.com/doci/7852/E5-2%20dies.png)
MLMVN is a complex-valued neural network with a classical feedforward topology, with a great advantage of the much higher functionality of MVN (Multi-Valued Neuron) over the one
Figure 2. The architecture of a CUDA enabled Tesla K20c GPU device (http://cdn.wccftech.com/wp-content/uploads/2012/08/NVIDIA-Kepler-GK110-Block-Diagram.jpg)
of real-valued neurons. MVN is based on the principles of multi-valued threshold logic over the complex plane [7]. MLMVN and its backpropagation learning algorithm are presented, for example, in [8]. The learning process of MVN and MLMVN involves the mapping of n inputs in the complex plane to an output on the unit circle. This is expressed in the form of a multi-valued function, also called a k-valued function of n variables [7], written as f(x1, ..., xn) = P(w0 + x1 w1 + ... + xn wn), where x1, ..., xn are the inputs of the function, w0 is the free weight or bias, and w1, ..., wn are the weights of a particular neuron. Here z = w0 + x1 w1 + ... + xn wn is the weighted sum of the neuron and P is the activation function of the neuron. Thus f(x1, ..., xn) = P(z) and [8], P(z) = e^(i 2πj/k), if 2πj/k ≤ arg(z) < 2π(j + 1)/k,
where i is the imaginary unit, and j = 0...k − 1 are the values of the k-value logic (multi-value).
As illustrated in [7], the function P(z) divides the complex plane into k equal sectors, through which the learning process moves the neuron's output toward the desired output by updating the weights with the difference between the wanted output Tkm and the actual output Ykm of the neuron. Here, Ykm is the actual output of the kth neuron of the mth (output) layer, and Tkm is the wanted output of the kth neuron of the mth (output) layer,
and the global error of the kth neuron of the mth (output) layer is expressed as δ* = Tkm − Ykm. The error is propagated back through the network from the output neurons, hence the name of the learning process. A neural network is a Machine Learning (ML) tool designed to process data from a given environment in what is called the learning process, in which the network is given a set of learning samples and trains without the assistance of a programmer, returning the closest approximation of a wanted output. A multi-layer neural network with multi-valued neurons (MLMVN) is one of the many tools in the ML tool set. It is a classical complex-valued feedforward neural network based on multi-valued neurons instead of sigmoidal neurons, offering greater flexibility. MLMVN has a derivative-free learning procedure based on the error-correction rule. The learning process is explained in detail in [8].
Implementation
To begin, the neural network was optimized for improved processing speed on the CPU, resulting in a speedup of 0.8× compared to the standard serial application. Then, an MLMVN simulator employing a graphical processing unit (GPU) of the system was designed using the parallel computing features provided by MATLAB's Parallel Computing Toolbox. Our goal was to enhance the speed at which the neural network could process data (learn) from a data set by dividing the large amount of work needed to process the data among multiple workers (processing cores) and by taking advantage of the four GPUs currently installed on the machine. Operations such as large matrix multiplication or division could be processed as much as 4.6× faster than on the CPU. However, the memory capacity of a single graphics card was incapable of running the simulation when, for instance, the neural network reached a size of 524,288 neurons. To solve this issue, very large matrices had to be partitioned in order to take advantage of all four available Tesla GPU devices in the supercomputer.
Results
The experimental results showed that as the size of the neural network increased, so did the advantage of the graphical processing unit (GPU). It yielded an average speedup of 3.15× compared to the same application on the CPU, clocking an average of 5.36 s per iteration of the neural network learning process, compared to 16.80 s per iteration of the same application running on the central processing unit. These experimental results are summarized in Fig. 3.
Conclusion
As the problems encountered in the modern age continue to grow larger and more complex, so does the demand for better processing tools, making graphical processing units among the most sought-after tools for large data processing. However, just like the central processing unit, the GPU has its limitations: the memory capacity of the devices used in this experiment, in particular, fell short during the learning process of a network containing
[Plot: execution time per iteration for the serial (CPU) and parallel (GPU) programs versus neural network size, in millions of neurons.]
Figure 3. Comparison of GPU vs. CPU processing time
524,288 hidden neurons, therefore requiring the implementation of the application on multiple GPU devices in order to match the capacity of the CPU. Nevertheless, this is but a small setback that can easily be overcome when conducting large experiments, provided the participants have access to multiple devices.
Acknowledgments This work was supported by the School of Science Summer Research Scholars Program. The author thanks Dr. Igor Aizenberg for his guidance during this research.
References
[1] V. Jalili-Marandi and V. Dinavahi. "Large-Scale Transient Stability Simulation on Graphics Processing Units," IEEE Power & Energy Society General Meeting, PES '09, 2009. doi: 10.1109/PES.2009.5275844
[2] R. Menéndez de Llano and J. L. Bosque. "Study of neural net training methods in parallel and distributed architectures," Dep. Electrónica y Computadores, Universidad de Cantabria, Av. los Castros S/N, 39.005 Santander, Spain, p. 9, 2008.
[3] E. Aizenberg and I. N. Aizenberg. "Batch linear least squares-based learning algorithm for MLMVN with soft margins," 2014 IEEE Symposium on Computational Intelligence and Data Mining (CIDM), 2014, pp. 48-55.
[4] G. Apruzzese, M. Colajanni, L. Ferretti, A. Guido and M. Marchetti. "On the effectiveness of machine and deep learning for cyber security," 2018 10th International Conference on Cyber Conflict (CyCon), Tallinn, Estonia, 2018, pp. 371-390.
[5] D. Kalamatianos, P. Liatsis and P. E. Wellstead. "Near-infrared spectroscopic measurements of blood analytes using multi-layer perceptron neural networks," 2006 International Conference of the IEEE Engineering in Medicine and Biology Society, New York, NY, 2006, pp. 3541-3544.
[6] I. Aizenberg. Complex-Valued Neural Networks with Multi-Valued Neurons. Studies in Computational Intelligence, Vol. 353. Springer, 2016.
[7] I. Aizenberg, C. Moraga, and D. Paliy. "A Feedforward Neural Network based on Multi-Valued Neurons," in Computational Intelligence, Theory and Applications. Advances in Soft Computing, XIV, B. Reusch, Ed., Springer, Berlin, Heidelberg, New York, pp. 599-612, 2005.
[8] I. Aizenberg and C. Moraga. "Multilayer Feedforward Neural Network Based on Multi-Valued Neurons (MLMVN) and a Backpropagation Learning Algorithm," Soft Computing, vol. 11, no. 2, January 2007, pp. 169-183.
Speed up of big data encryption on GPU using CUDA
Zi Xin Chiu∗
Department of Computer Science, Manhattan College
Abstract. Homomorphic encryption provides a solution for delegating computations to the cloud while preserving the confidentiality of sensitive data. As big data sets grow, encryption speed becomes more and more important for practical homomorphic encryption schemes. In this paper, we study the performance of the Paillier cryptosystem, a partially homomorphic cryptosystem that allows sums to be performed on encrypted data without decrypting it first. Implementations were done on both CPU and GPU in order to compare the speedup gained. Our results show that, when dealing with large amounts of data (∼100K messages), the CUDA (Compute Unified Device Architecture) implementation on the GPU gains a speedup factor of approximately 690 over the sequential CPU implementation.
Introduction
With the rapid development of network information technology in the last decade, more and more digital information is being collected from the internet. Such large volumes of data are called big data. Since big data is difficult to process and analyze using traditional methods, cloud services are used to solve problems such as storage capacity, file transfer, resource sharing, etc. Since the data are often sensitive, their owners would prefer to keep them confidential. Otherwise, a cloud provider or an attacker with unauthorized access to the cloud may analyze the client data over a long period and extract sensitive and valuable information (a data-mining based attack). Such security concerns can be mitigated by keeping all the data encrypted in the cloud. However, if the data owners want to perform calculations on the data stored on a remote server, they have to provide a secret key to the server to decrypt the data before performing the required calculations. This may compromise the confidentiality of the client's data stored in the cloud. Homomorphic encryption allows computation on ciphertext without knowing the secret key, i.e., without decryption. It provides a solution for outsourcing computations to an untrusted cloud while protecting the confidentiality of the users' sensitive data. Fig. 1 gives a graphical illustration of the application of homomorphic encryption to cloud computing security. The Paillier encryption scheme [1] is named after Pascal Paillier, who invented it in 1999. The Paillier cryptosystem has the additive property for homomorphic encryption. It is capable of performing calculations on encrypted data in scenarios such as machine learning on encrypted data, encrypted SQL databases, and e-voting. Although Paillier's scheme is among the most efficient additively homomorphic schemes currently known, a naive implementation is still not efficient enough for practical settings.
To ensure a certain level of security, cryptographic schemes usually require a large key size, e.g. at least 1024 bits in length, which leads to long encryption/decryption times.
∗ Research mentored by Miaomiao Zhang, Ph.D.
[Diagram: the data owner encrypts messages (e.g. 12, 8, 20) with HomEnc and sends the ciphertexts over an unsecured channel to the untrusted cloud server, which computes F(Cipher1, Cipher2) directly on ciphertexts; the owner decrypts the returned result with HomDec.]
Figure 1. Security of Cloud Data
In this work, we implement the Paillier cryptosystem in C++ on CUDA, which enables the GPU to be used for general-purpose processing. The primary objective of this paper is to compare GPU and CPU performance for a specific security application: securing data in cloud computing using homomorphic encryption. The paper is organized as follows: description of the Paillier cryptosystem, its homomorphic property, and the C++ implementation details; discussion of the GPU principle and the CUDA implementation details; presentation of the results and analysis of the performance tests; and, finally, our conclusions.
Paillier cryptosystem
The Paillier cryptosystem is a partially-homomorphic cryptosystem. Assume the usual setting of a public-key cryptosystem with a public key (PK) and secret key (SK):
Encryption: Enc_PK(m)
Decryption: Dec_SK(c), such that Dec_SK(Enc_PK(m)) = m.
A scheme is said to be homomorphic for a function F if F(Enc_PK(m)) = Enc_PK(F(m)). A scheme that is homomorphic for any F is fully homomorphic encryption (FHE); one that is homomorphic only for F of limited complexity is partially homomorphic (PH). The current FHE schemes are extremely inefficient for any real application; therefore, we are interested in the Paillier cryptosystem, a PH scheme with an additive homomorphic property. Although PH schemes are in general more efficient than FHE, mainly because they are homomorphic with respect to only one type of operation, addition or multiplication, the performance of Paillier is still not good enough for high-throughput use cases, such as encrypting large volumes of data in the cloud. So we aim to accelerate the Paillier cryptosystem with the GPU.
Key generation
The Paillier scheme is described below:
Chiu
The Manhattan Scientist, Series B, Volume 5 (2018)
233
Private keys: p and q are large prime numbers of equal length;
λ = lcm(p − 1, q − 1);
μ = (L(g^λ mod n^2))^(−1) mod n, where L(x) = (x − 1)/n.
Public keys: n = p · q; g chosen at random from Z*_(n^2).
Encryption: message 0 ≤ m < n; random number 0 < r < n; cipher c = Enc_PK(m) = g^m · r^n (mod n^2).
Decryption: cipher c < n^2; message m = Dec_SK(c) = L(c^λ mod n^2) · μ (mod n).
The parameter g must also satisfy the rule gcd(L(g^λ mod n^2), n) = 1, i.e. gcd((g^λ mod n^2 − 1)/n, n) = 1, in order for it to be valid.
Homomorphic property
A famous application of the additive homomorphic property that the Paillier cryptosystem provides is electronic voting. We need a scheme such that third parties are unable to read an elector's vote but the authorities can still obtain the result of the election. The identity is as follows: m1 + m2 (mod n) = Dec_SK(c1 · c2 mod n^2). Basically, the product of two ciphertexts decrypts to the sum of the two underlying messages. Such additive homomorphism allows us to perform anonymous calculation when the environment is not trusted; thus, unauthorized parties are unable to retrieve our sensitive data.
Paillier implementation
In this section, we discuss the implementation of the Paillier cryptosystem in C++. The private keys p and q have to be large prime numbers of equal length, where the larger they are, the safer the scheme gets. We chose to use the datatype unsigned long long, which allows us to store integer values up to 2^64 − 1, the biggest integer value supported in standard C++.
The pseudo-random number generator (PRNG) provided by the default <cstdlib> library in Visual Studio 2017 does not work well with the Paillier cryptosystem. First, the maximum value the generator can produce is 32767, which is very small relative to the Paillier cryptosystem's requirements. Second, the full period of this PRNG, i.e. the number of different states it passes through before returning to the seed state for any given seed, is very short, which makes the values insufficiently random. For these reasons, we decided to use another PRNG, the Mersenne Twister (mt19937), which can generate much larger values and has a much longer period, making up for the weaknesses of the former PRNG. One problem that we encountered during the implementation of the Paillier cryptosystem is overflow in fast modular multiplication/exponentiation. In our case, since the maximum value that
can be stored is 2^64 − 1, the direct multiplication of two very large numbers before the modular reduction may exceed the limit and result in an overflow. To avoid overflow, we can multiply the two numbers recursively: let a and b be two variables; we first calculate a·(b/2), then double the result, adding a once more when b is odd. To calculate a·(b/2), we first calculate a·(b/4) and then double it, and so on, reducing modulo m at each step. With this method, the resulting value never exceeds the maximum limit. We utilized the extended Euclidean algorithm for the modular multiplicative inverse needed to calculate the private key, μ. This algorithm finds integer coefficients x and y such that a · x + b · y = gcd(a, b).
The algorithm updates the result of gcd(a, b) using the result of the recursive call gcd(b % a, a). The values of x and y are updated in each recursion using the expressions x = y1 − (b/a) · x1 and y = x1. The extended Euclidean algorithm is especially useful when a and b are co-prime, since then x is the modular multiplicative inverse of a mod b, and y is the modular multiplicative inverse of b mod a. In our case, the calculation of the modular multiplicative inverse is an important step of key generation in the Paillier cryptosystem.
CUDA
CUDA is a parallel programming platform from NVIDIA that allows general-purpose parallel computing on the GPU; it is the software platform that gives direct access to the GPU's parallel computational elements and instruction set. As Graphics Processing Units (GPUs), processors with thousands of arithmetic cores usually used for generating images and 3D computations, have become faster every year, this platform has improved along with them. Many developers utilize CUDA to speed up their computationally heavy applications. For our application, we used CUDA C++ to accelerate the encryption of huge amounts of data with the Paillier cryptosystem on CUDA-capable GPUs.
How CUDA works
First, we allocate memory space on the GPU. Second, we copy our data to the GPU's memory. Third, the CPU instructs the GPU to execute parallel computations on each of the GPU's cores. Fourth, we copy the result from the GPU's memory back to the CPU's memory. Fifth, after all the calculations are done, we free the memory that was previously allocated.
CUDA functions
In the program, all functions [2] are separated into device-side functions, which can only run on the GPU, and host-side functions, which can only run on the CPU. Note that all functions in the normal C++ libraries are considered host-side functions. Therefore, starting from scratch, we had to define most of the functions as both device- and host-side functions. The syntax for defining
Chiu
The Manhattan Scientist, Series B, Volume 5 (2018)
235
these functions is to add __device__ and __host__ in front of the function. Luckily, a function can be given both qualifiers at once, so we do not need to write the same function twice. There is also a kernel function type, marked with the qualifier __global__, that directs the GPU's work and handles the kernel calls. For example, when we encrypt a message on the GPU, the __global__ function instructs the GPU to call the encryption function, which is written as a __device__ function, and process the message passed in through the parameters.
Blocks and threads utilization
To provide data parallelism, a multithreaded CUDA application is partitioned into blocks of threads that execute independently (and often concurrently) from each other. The configuration happens when the __global__ function is called from the CPU as function_name<<<Blocks, Threads>>>(parameters). Blocks indicates how many cores run at the same time, and Threads indicates how many tasks each core is working on. An analogy would be how many people are working on how many tasks at the same time. The best configuration is for the product of blocks and threads to equal the total number of tasks, with the number of blocks as large as possible.
Threads indexing
When we have different cores working at the same time, we have to provide the index of the element to show which data each core is working on. The usual way of using the index from a for-loop does not work because every core works on a different index. Thus, we use the following equation [2] to compute the index: threadID = blockIdx.x * blockDim.x + threadIdx.x.
Here, blockIdx.x indicates the index of the block (core), blockDim.x indicates how large the block is, i.e., the number of threads we specified earlier, and threadIdx.x indicates which element within the block each core is working on. In summary, this index lets the __global__ function properly instruct the GPU to run the parallel computation.
Experimental results and analysis
We compared the performance of the Paillier cryptosystem on both the CPU and the GPU, using an Intel(R) Core(TM) PC (i7-8550U, 8GB RAM, Windows 10 64-bit operating system) and an NVIDIA GPU (NVIDIA GeForce GTX 1080, 8GB on-device memory). In the experiment, 100K 24-bit messages were processed. We measured and compared the execution times for four functions: key generation, encryption, decryption, and modular multiplication (see Table 1). As expected, the use of the GPU significantly improves the execution speed of encryption, decryption, and modular multiplication. The improvement factor increases with the size of the input data. On our input data (100K messages), the GPU implementation of Paillier was more than 690 times faster than the CPU implementation.
236
The Manhattan Scientist, Series B, Volume 5 (2018)
Chiu
Table 1. Experimental results
Function    CPU       GPU       Speedup
KeyGen      2-10 ms   3-10 ms   ∼1
Enc         29 s      42 ms     690.5
Dec         8 s       26 ms     307.7
Mult        190 ms    617 µs    307.9
We also observe that the key generation times on the two processors are nearly the same, and sometimes even a little longer for the GPU implementation. This is reasonable because key generation is an independent, randomized, and non-parallelizable phase. In addition, the data need to be transferred between the device (GPU) and the host (CPU), which adds processing time (∼1 ms).
Conclusion and future work
In this paper, we focused on Paillier encryption for big data. We presented both a classical CPU-based implementation of the Paillier cryptosystem and a GPU-based implementation using CUDA. We ran experiments and compared the performance of the two implementations. The results show that Paillier encryption of large amounts of data can be significantly accelerated on the GPU. The GPU is designed to do multiple tasks simultaneously; if the tasks can be made massively parallel, then the GPU computation will be several orders of magnitude faster. Our results demonstrate that the encryption of massive amounts of data is greatly accelerated because messages can be encrypted at the same time. Our study has demonstrated that GPU computation has many advantages. However, the data type we used in our Paillier implementations can only support integers up to 64 bits long. As a next step, we would like to work with a CUDA library that supports arbitrary-precision arithmetic and allows us to generate and process integers larger than 2^64. Also, since the experiments were run in Visual Studio 2017 on a Windows PC, we are interested in running tests on a Linux system such as the Dionysus server of the Manhattan College School of Science. Of course, we will first need to deal with the problems caused by incompatibilities between CUDA versions and C++ compilers.
Acknowledgment This work was supported by the School of Science Summer Research Scholars Program.
References [1] C. Jost, H. Lam, A. Maximov, and B. Smeets. Encryption Performance Improvements of the Paillier Cryptosystem, Cryptology ePrint Archive, Report 2015/864, 2015 [2] M. Harris. An Even Easier Introduction to CUDA 2017: NVIDIA Developer Blog
Implementation and evaluation of LPN-based authentications

Eric Ciccotelli∗

Department of Computer Science, Manhattan College

Abstract. In this paper, we study the Learning Parity with Noise (LPN) problem. We implemented two LPN-based symmetric-key authentication protocols, HB and HB+. Our results show that the LPN-based protocols are fast and efficient. We also evaluated the HB protocol: on the practical side, we implemented an active attack on HB; on the theoretical side, we presented the calculation for optimizing the soundness and completeness parameters. To the best of our knowledge, our paper is the first to give such a detailed analysis of the security parameters for the HB protocol.
Introduction

In information security, authentication is the process of identifying an individual or verifying data. A simple example is a web browser and a website, which authenticate each other when a page loads. Secure authentication is important because it can prove that data is being sent from the right location and that the data is genuine. In the two-party communication setting, entity authentication can be achieved through cryptographic challenge-response protocols, in which the prover proves its identity to the verifier by demonstrating knowledge of a secret known to be associated with that entity. This is done by providing a response to a time-variant challenge, where the response depends on both the entity's secret and the challenge. Such interactive entity authentication protocols can be based on symmetric-key or public-key techniques.

The Internet of Things (IoT) is the network of physical devices, vehicles, home appliances and computing devices, including smart objects. Many IoT devices, e.g. security cameras, small speedometer devices in cars, smartphones, and even RFID readers in office buildings and college campuses, are small and have limited power supplies and storage capacity, which makes it problematic to deploy traditional authentication techniques. With increasing popularity, the IoT industry requires many devices to be secure and safe for all users while still maintaining speed and efficiency. In transferring data between two entities during the authentication process, the entities are vulnerable to attacks which may compromise data and put the entities under the control of an attacker. Studying the underlying protocols that control authentication between two entities, as well as the attacks on them, may lead to new methods of securing data and allowing entities to authenticate each other securely and efficiently.

Motivation.
In a resource-constrained environment, standard cryptographic algorithms, which are critical for ensuring the authenticity and integrity of communications, can be too slow, too big, or too energy-consuming. For this reason, studying lightweight cryptographic solutions for resource-constrained devices is an important practical and theoretical research challenge.
∗ Research mentored by Miaomiao Zhang, Ph.D.
Since many lightweight protocols are designed and used for many applications, including the IoT, whose popularity has risen in recent years, the need for secure, efficient and low-cost authentication protocols has also grown. As these technologies become more mainstream, they are scrutinized and attacked by people with possibly malicious intent. This is why it is important to study these protocols and to understand the attacks on them, in order to protect them. Additionally, certain protocols have specific benefits and use cases which can also be studied and utilized. Some protocols may be more secure but less efficient than others; therefore, different authentication protocols should be used in different cases.

Goals and objectives. In this project, our objective is to implement and analyze two LPN-based authentication protocols using C++. The protocols to be implemented, evaluated and tested are HB and HB+. We will compare the computational cost, storage, and attacks of each LPN-based authentication scheme to get a better understanding of LPN-based protocol design. These protocols each have vulnerabilities in the form of various attacks, which we expect to implement and test. These attacks include passive, active, and man-in-the-middle (MIM) attacks. In the following, we start by briefly describing the LPN problem, continue by describing the HB and HB+ protocols we implemented and evaluated, provide our implementation details and analysis, and finally summarize the results and present our conclusion.
The learning parity with noise problem

The learning parity with noise (LPN) problem was introduced by Angluin and Laird [1]. The problem involves a binary secret vector x of length n, and an adversary who is given a number of LPN samples. Each sample has the form (a, ⟨a, x⟩ ⊕ e), where a is a uniformly random vector and e is a bit that is 1 with probability ε. The attacker's goal is to compute x, or to distinguish the samples from completely random ones (in the decision version of the problem). Because of its computational simplicity (bit-XOR arithmetic) and strong security guarantee (no efficient algorithm is known to solve it), LPN has been intensively studied for building efficient (lightweight) authentication protocols for resource-constrained scenarios such as the IoT. It has been used in the McEliece cryptosystem, as well as in authentication protocols such as HB and its subsequent work.

Let n, m ∈ N with n ≤ m, and let Ber_ε be the Bernoulli distribution over Z_2 with parameter ε ∈ [0, 1/2). An instance of the (n, ε)-LPN problem is a pair (A, b), where A ←$ Z_2^{m×n} is a uniformly random matrix and

    b = A · x ⊕ e

is computed from the secret x ∈ Z_2^n and the error vector e ←$ Ber_ε^m. The goal is to recover the secret vector x.
The LPN problem is hard to solve. This is due to the noise (the vector e) added after the matrix-vector multiplication. Without noise, the problem is simply a matrix-vector equation, which is easy to solve using basic linear algebra (Gaussian elimination). With the noise, however, the problem becomes very hard. In the worst case, it is equivalent to the notorious problem of decoding a random linear code, which is NP-hard. In the average case, the BKW [2] algorithm and its variants are the most effective algorithms for solving the LPN problem, but they still require at least 2^Θ(n/log n) samples and an asymptotic running time of 2^Θ(n/log n). The LPN problem is also believed to resist known quantum computing attacks.
The HB and HB+ protocols

Work on LPN-based authentication protocols began with Hopper and Blum [3], whose HB protocol was later proven to be secure against passive attacks, assuming the hardness of LPN. The original motivation for the HB protocol was to enable unaided human authentication: the protocol should be simple enough to be carried out without the help of a computational device. Subsequent work has found that the key sizes and error rates required to ensure security may be too large for humans to employ with ease comparable to, say, password-based authentication. Nevertheless, as noted by Juels and Weis [4], HB-type protocols are lightweight enough to be potentially applicable in the RFID setting. There is a long line of research devoted to devising efficient secret-key authentication protocols based on the hardness of the LPN problem and its variants. In this section, we present the original HB protocol and the HB+ protocol [4]. The HB and HB+ protocols consist of k = poly(n) iterations of what is known as a "basic authentication step." The protocols are executed by two parties: the prover P and the verifier V.
The key for HB is a vector x of length n, where n is the security parameter. For HB+, the key consists of two vectors x, y of length n. For i ∈ [k], a_i, b_i ∈ F_2^n are column vectors used in the execution.
In HB, as shown in Fig. 1, a prover P and a verifier V share a random secret key x ∈ F_2^n. In the i-th authentication step, the verifier sends a random challenge a_i ←$ F_2^n to the prover, and the prover replies with z_i = a_i^T x ⊕ e_i, where e_i ← Ber_ε. The verifier computes w_i = a_i^T x.

Figure 1. HB (the i-th authentication step): V samples a_i ←$ F_2^n and sends it to P; P samples e_i ← Ber_ε and replies with z_i = a_i^T x ⊕ e_i; V computes w_i = a_i^T x.
HB+ adds a second secret y ∈ F_2^n and a third round, as shown in Fig. 2.

Figure 2. HB+ (the i-th authentication step): P samples b_i ←$ F_2^n and sends it to V; V replies with a challenge a_i ←$ F_2^n; P samples e_i ← Ber_ε and sends z_i = a_i^T x ⊕ b_i^T y ⊕ e_i; V computes w_i = a_i^T x ⊕ b_i^T y.

In both HB and HB+, at the end of k steps, V checks what fraction of the answers z_i were correct. If more than k · u(ε) are correct, for u(ε) some function of ε, the verifier accepts; otherwise it rejects. k and u(ε) should be set high enough to allow the honest prover to authenticate w.h.p., but low enough that a malicious third party should not be able to authenticate
by randomly guessing. In particular, as noted in [5], for both HB and HB+, u(ε) = (1 + τ)ε suffices to achieve a completeness error negligible in the security parameter, for any positive constant τ.

We can use matrix notation to simplify working with the HB and HB+ protocols in parallel, as shown in Figs. 3 and 4; we adopt the parallel notation in the rest of the paper. Let A, B ∈ F_2^{n×k} be the matrices whose i-th columns, for i ∈ [k], are the vectors a_i and b_i, respectively. In the HB protocol, for example, V sends the challenge A ←$ F_2^{n×k}; P replies with z = A^T x ⊕ e, where e ← Ber_ε^k; V computes w = A^T x and accepts iff |z ⊕ w| ≤ k · u_HB(ε).

Figure 3. HB (parallel notation): V samples A ←$ F_2^{n×k} and sends it to P; P samples e ← Ber_ε^k and replies with z = A^T x ⊕ e; V computes w = A^T x and verifies |z ⊕ w| ≤ k · u_HB(ε).
Figure 4. HB+ (parallel notation): P samples B ←$ F_2^{n×k} and sends it to V; V replies with A ←$ F_2^{n×k}; P samples e ← Ber_ε^k and sends z = A^T x ⊕ B^T y ⊕ e; V computes w = A^T x ⊕ B^T y and verifies |z ⊕ w| ≤ k · u_HB+(ε).
Implementation and Analysis

In this section, we discuss the implementation of HB, HB+, and the active attack on HB. We also present a detailed analysis of the security parameters of the HB protocol.

Vector representation

As described in the previous sections, the LPN-based protocols are lightweight because they only use the scalar (dot) product of binary vectors and the bit-XOR operation. As we want to
implement the protocols and attacks in C++, we chose the vector structure in the Standard Template Library (STL) to represent the binary vectors used in the protocols; e.g., the shared secret key x and the random noise vector e are binary vectors, and the binary matrix A can be represented as a vector of binary vectors. The declaration statements for an integer vector and a 2D matrix are as follows: vector<int> vect; and vector<vector<int>> matrix;. STL vectors in C++ are very similar to arrays in C, but they are dynamic. Additionally, the memory for a vector is automatically de-allocated when the variable goes out of scope, making memory cleanup much simpler than with a dynamic array. Also, due to the nature of vectors, it is simpler to perform arithmetic operations on them than on an array.

Pseudo-random number generator (PRNG)

A pseudo-random number is a number that is not truly random, but appears random. A PRNG is an algorithm that uses mathematical formulas to produce sequences of numbers approximating the properties of random numbers. PRNGs are widely applied in many fields of computer science; in particular, they play a very important role in cryptography. One way to generate random numbers in C++ is the function rand(). It takes no arguments and returns an integer that is a pseudo-random number between 0 and RAND_MAX, where each value has (approximately) an equal probability of being chosen. Since a PRNG starts from an arbitrary starting state determined by a seed, we need to call the srand() function in C++ to seed the generator used by rand(). A PRNG always produces the same sequence when initialized with the same seed. Roughly speaking, the number of outputs until the internal state repeats is the period of a PRNG; generally, the period should be as long as possible.
The size of its period is one of the important factors in the cryptographic suitability of a PRNG. The C++ rand() function is weak and not suitable for cryptographic purposes. We adopted a better PRNG, the Mersenne Twister (mt19937), which has a famously long period of 2^19937 − 1.
In our implementation, we need to simulate independent draws from a Bernoulli distribution with a given probability ε, or independent draws from the uniform distribution on the set {0, 1}. We adopted the class bernoulli_distribution for generating samples that are true (1) with probability p and false (0) with probability 1 − p. This class can also be used directly for generating samples from the uniform distribution by setting the probability p = 1/2.

Vector/matrix multiplication

Due to the design of the HB and HB+ protocols, we need to multiply two binary vectors, or a binary vector and a binary matrix. When the bit products are accumulated, we need to perform the bit-XOR operation rather than normal addition. Since there is no predefined built-in function in C++ to perform such operations, we defined and implemented
our own functions, which use standard mathematical practice but replace addition with XOR, as shown in the equation below:

    [ A1 A2 ] [ B1 ]   [ A1·B1 ⊕ A2·B2 ]
    [ A3 A4 ] [ B2 ] = [ A3·B1 ⊕ A4·B2 ]

Attack on HB

Besides the implementation of the HB and HB+ protocols, we also implemented an attack on the HB protocol. This is an active attack, in which the attacker in the first phase may observe several sessions between an honest prover and an honest verifier, and run many authentication protocols with the prover, but may not interact with the verifier. Afterward, without access to the prover, the adversary tries to fool the verifier in the second phase. The active attack on HB can be summarized as follows. The attacker A can pick a special challenge to reveal partial information about the key:

1. A chooses a unit vector a = (1 0 0 . . . 0)^T.
2. P responds with z = a^T x ⊕ e = x_1 ⊕ e.
3. A repeats the procedure from step 1 multiple times and takes the majority vote of the collected z's to recover x_1, the first bit of the key.

The attacker A repeats the above steps for i = 1, 2, 3, . . . , n to recover the whole key x. In our implementation, the number of repetitions for attacking one bit is set at 200.

Soundness and completeness error

We are now ready to present our analysis of the security parameters for the HB protocol; in this section we write n for the number of repetitions of the basic authentication step. For a two-party interactive protocol like HB, a soundness error occurs when a prover giving random answers succeeds in authenticating itself. Given a random guess z̃, the distribution of z̃ ⊕ Ax is also uniformly random, regardless of the distribution of Ax. The weight of the resulting vector, |z̃ ⊕ Ax|, is distributed as the sum of independent Bernoulli variables of bias 1/2. Let Y_i denote the i-th bit of z̃ ⊕ Ax, and let Y = Σ_{i=1}^n Y_i = |z̃ ⊕ Ax|. Then Y ∼ B(n, 1/2).
We can compute the probability of a soundness error as follows:

    Pr[|Ax ⊕ z̃| ≤ n · µ_HB(ε)] = Pr[Y ≤ n · µ_HB(ε)]
                                = Pr[Σ_{i=1}^n Y_i ≤ n · µ_HB(ε)]
                                ≤ e^{−2(1/2 − µ_HB(ε))² · n}
                                = 2^{−d_s · n}.

The calculation above applies the Chernoff-Hoeffding bound. We let d_s denote the coefficient of the n term in the exponent.
A completeness error occurs when an honest prover is rejected. Similarly, we compute the probability of a completeness error as follows. Given a correctly computed z, let Y = Ax ⊕ z = e, with e ←$ Ber_ε^n, and let Y_i denote the i-th bit of Y. Then

    Pr[|Ax ⊕ z| ≥ n · µ_HB(ε)] = Pr[Σ_{i=1}^n Y_i > n · µ_HB(ε)]
                                = Pr[Σ_{i=1}^n Y_i > ε · n · (1 + τ)]
                                ≤ e^{−τ² · n · ε / 3}
                                = 2^{−d_c · n}.

In the equation above, we use d_c to denote the coefficient of the n term in the exponent. In cryptography and security, it is common practice in the literature to set the completeness error to be at most 2^{−40} and the soundness error to be at most 2^{−80}. Based on the calculations above, if we let 2^{−d_c·n} ≈ 2^{−40} and 2^{−d_s·n} ≈ 2^{−80}, we get

    d_s = log₂(e) · 2 · (1/2 − µ_HB(ε))² = 80/n,
    d_c = log₂(e) · (ε · τ²)/3 = 40/n,

and hence can decide the parameter n in HB, the number of repetitions of a single authentication step (which also determines the dimensions of the matrix A).
Results

We implemented and evaluated the performance of HB and HB+ on a high-end desktop PC (i5-6500 CPU @ 3.20 GHz, 16 GB RAM, Windows 8 64-bit). In the experiments, we set the security parameter to 80 bits and generated keys of length 128. We measured the execution time of the HB and HB+ protocols (see Table 1). The values of n used in our simulation are determined by our analysis in the previous section. Our implementation and simulation results show that LPN-based protocols like HB and HB+ are very efficient. They have small communication and computation complexity because they only involve bit-wise XOR operations on binary vectors. Such LPN-based schemes are very suitable for low-power devices like RFIDs. The simulations of HB and its active attack confirmed our theoretical analysis above.
Conclusion

In this research project, we implemented two LPN-based authentication protocols, HB and HB+. Besides the implementations in C++, we presented an analysis of the security parameters
Table 1. Performance of the HB and HB+ protocols

Bias ε   Iterations n   HB time (ms)   HB+ time (ms)
0.05     641            149            282
0.10     962            223            452
0.15     1420           317            647
0.20     2129           470            955
0.25     3328           746            1534
of the HB protocol: based on the soundness and completeness errors, we calculated the minimum number of repetitions of a single authentication step in HB (i.e., the minimum number of rows required for the random matrix A in the protocol). To the best of our knowledge, our paper is the first to provide such a detailed analysis of this parameter. Our simulation results have confirmed our theoretical analysis. We hope this work is a first step toward a better understanding of LPN-based symmetric-key authentication. In the future, we may implement and analyze other LPN-based protocols, e.g. ZNHB, a man-in-the-middle-secure authentication protocol, and apply some of our findings to real-world applications such as RFIDs.
Acknowledgment This research was supported by the School of Science Summer Research Scholars Program.
References

[1] D. Angluin and P. D. Laird. Learning from noisy examples. Machine Learning, 2(4):343–370, 1987.
[2] A. Blum, A. Kalai, and H. Wasserman. Noise-tolerant learning, the parity problem, and the statistical query model. Journal of the ACM (JACM), 50(4):519, 2003.
[3] N. J. Hopper and M. Blum. Secure human identification protocols. In C. Boyd, editor, Advances in Cryptology - ASIACRYPT 2001, pages 52–66. Springer Berlin Heidelberg, 2001.
[4] A. Juels and S. Weis. Authenticating pervasive devices with human protocols. In Proc. CRYPTO, pages 293–308, 2005.
[5] J. Katz, J. S. Shin, and A. Smith. Parallel and concurrent security of the HB and HB+ protocols. Journal of Cryptology, 23(3):402–421, 2010.
Intelligent edge detection using an MLMVN

Josh Persaud∗

Department of Computer Science, Manhattan College

Abstract. This paper explores the use of the multilayer neural network with multi-valued neurons (MLMVN) as a non-linear filter. MLMVN is a type of complex-valued feedforward neural network. Here MLMVN is used as an image filter, in our case for edge detection. MLMVN has previously been shown to be a successful low-pass filter for noise reduction. The goal of this experimental work is to test how well MLMVN can detect edges in images, and how well it can reduce noise in images while detecting the image edges. We use an MLMVN with a single hidden layer and process overlapping patches taken from clean images and from noisy images.
Introduction

This work employs a multilayer neural network with multi-valued neurons (MLMVN) [1] used as a non-linear filter. MLMVN is a feedforward neural network based on multi-valued neurons whose weights and activation function are complex-valued. The MLMVN topology is the same as that of the multilayer perceptron (MLP): the nodes of the network do not form a cycle of feedback (recurrent) connections; instead, information flows through the network in one direction only. Fig. 1 demonstrates the topology of the MLMVN employed in this work.
Figure 1. MLMVN topology. The input layer does not contain neurons and is used only to distribute the input signal among the first hidden layer neurons. Adapted from [2].
While reading this report and analyzing the results, it is imperative to understand what edge detection is and what types of applications employ it. Edge detection (demonstrated in Fig. 2) is a technique for finding boundaries of objects within images by detecting discontinuities in brightness. There are many different methods that can be used to detect the edges of an image; for this experiment, we focus on the Sobel operator [3]. Common applications of edge detection can be found in digital imaging and robotics, where the edges detected in the image are used for identifying objects and moving a robot around a scene, with the added use of image segmentation.
∗ Research mentored by Igor Aizenberg, Ph.D.
Figure 2. The image on the left is a sample of one of the images used to train MLMVN. The image on the right is what it looks like when edge detection is performed on the image using the Sobel operator.
Edge detection in images is important for image segmentation and distinguishing image details. While there are many methods of edge detection that work for clean (noise-free) images, there is a lack of methods suitable for noisy images. All devices used to acquire real-world images are susceptible to noise. This includes both analog and digital devices (cameras, microphones, radar sensors, etc.). Noise is usually caused by uncontrollable factors such as heat, lack of light, or humidity; in the real world, most images contain some type of noise. Having a method to perform edge detection on those images while simultaneously denoising them would be very attractive. In our simulations, we use additive Gaussian noise, with the hope that the MLMVN will be able to remove much of the noisy texture and perform edge detection while ignoring the noise. Additive Gaussian noise is the kind most frequently appearing in images. Gaussian noise (created by poor lighting, heat, etc.) can be reduced using various filters, but removing it properly is much more difficult than removing impulsive noise. The problem with using a spatial-domain filter to reduce the noise in an image is that, while we reduce the noise, we simultaneously smooth the image, blurring its edges and details. The objects in a heavily noisy image do not have sharp edges and lose important details. As a result, any classical edge detection algorithm applied after noise filtering usually fails to detect many edges, especially those related to small details, which can be smoothed and distorted by a filter. Hence, the primary goal of this work is to detect the edges of noisy images. Before testing MLMVN as an edge detector applied to noisy images, we must be confident that MLMVN can perform edge detection on noise-free images.
Methodology

Performing edge detection on clean images

In order to train MLMVN and test its edge detection capabilities, a learning set must be created. To create the learning set for the MLMVN, the same set of 400 clean grayscale images employed in [1] was used in this work. The Sobel operator was used to detect edges in these images. The original images were then used as sources of inputs, while their edged counterparts were used as target outputs of the neural network. In turn, the same approach was applied to noisy images: Gaussian noise was artificially added to the same 400 images, and these noisy images were used to create inputs for the neural network, while the edged images created from the corresponding clean 400 images were used as the desired outputs. Once a learning set was created, MLMVN was trained on it in the same way as described in [1]. To test the MLMVN, a set of 10 test images was used; these images were never part of the learning set, so the network had never seen them before. We used patches of various sizes taken randomly from the images to create a learning set (in the same way as in [1]). To find the best results, the network was tested with different patch sizes. For edge detection on clean images, this experiment tested two separate pairs of input and output patch sizes, where the patch taken from the clean image was larger than that taken from the corresponding edged image, creating an overlap of the two patches during the training process. First, an input patch size of 5×5 was used in conjunction with a 3×3 output patch size. Next, this experiment tested an input patch size of 7×7 and an output patch size of 5×5.
Both of these used the same number of learning samples (40,000 – 100 patches randomly taken from each of the 400 images) and the same number of hidden neurons (2048 – the best filtering results in [1] were obtained with this number of hidden neurons).

Performing edge detection on Gaussian-corrupted images

To accomplish the primary goal of this research, we first tested the MLMVN using clean images. Only if MLMVN was able to perform edge detection on clean images would the experiment proceed to test the MLMVN on images corrupted by Gaussian noise. The same images used in the clean-image test were also used in the noisy-image test, except that Gaussian noise was artificially added to the clean images. The Gaussian noise was generated using σ_noise = 0.2σ, where σ is the standard deviation of the image. The exact same edge-detected images created using the Sobel operator were used as the MLMVN output. The rest of the methodology from the clean-image test was used here. Just as with the clean images, different patch sizes were tested to find the best result: first an input patch size of 5×5 with a 3×3 output patch size, then an input patch size of 7×7 with an output patch size of 5×5. Both used the same number of learning samples (40,000) and the same number of hidden neurons (2048).
Results

Filtering of clean images with MLMVN

In the following figures (3-7), images labeled 'A' are the original clean images before any processing was performed, images labeled 'B' were filtered with the Sobel operator, and images labeled 'C' were filtered using MLMVN after the learning process was finished.
[Figures 3–7. For each figure: (A) original clean image, (B) Sobel-filtered image, (C) MLMVN-filtered image.]
A very interesting result of our experiments is the ability of MLMVN to perform image segmentation while detecting edges. Segmentation is the process of detecting and distinguishing areas with various textures in an image. While the Sobel operator performs edge detection well, it is unable to distinguish between shadows and physical objects. An unexpected result was that MLMVN, while performing edge detection successfully, is also able to perform image segmentation. In Fig. 3C one can see how it separates the train station's overhead from the sky behind it; in the Sobel image, Fig. 3B, there is no separation of physical and non-physical objects. This can also be seen in Fig. 4C. Comparing Fig. 5C to the original image (Fig. 5A) and the Sobel-filtered image (Fig. 5B), one can see how MLMVN was able to detect details in the sky behind the clock; these details are invisible to the human eye in the original image and cannot be detected by the Sobel operator. In Fig. 6C, compared with its Sobel counterpart (Fig. 6B), one can see that unlike Sobel, MLMVN was able to separate the airplane's underbody shadow as well as the photographer's shadow; in the Sobel version, the shadow of the airplane appears to be physically part of the airplane. MLMVN was also able to detect very minor intensity changes in the sky, whereas Sobel was not, and it preserved details of the sky (Fig. 7C) better than Sobel (Fig. 7B).
250
The Manhattan Scientist, Series B, Volume 5 (2018)
Persaud
Filtering of Gaussian-corrupted images using MLMVN

In the following figures (8-12), images labeled 'A' are images corrupted with Gaussian noise before any filtering was performed, images labeled 'B' were filtered with the Sobel operator, and images labeled 'C' were processed using MLMVN after the learning process was finished. MLMVN in this capacity was used to perform noise filtering and edge detection simultaneously.
[Figures 8–12. For each figure: (A) Gaussian-noise-corrupted image, (B) Sobel-filtered image, (C) MLMVN-filtered image.]
With the Gaussian noise added to the input images (labeled 'A'), the Sobel operator had a much harder time distinguishing the edges of objects from the edges of noisy pixels. The Sobel operator was nevertheless still able to perform edge detection on the noisy images. However, a disadvantage is that sharply detected edges of noisy textures prevent a full analysis of an edged image; small details become indistinguishable behind noisy edges. MLMVN did a great job of detecting the edges of the objects while ignoring much of the noise, and it is still able to perform image segmentation on the noisy images. Take Fig. 8C for example: the shadows of the airplane and the photographer are still at a different intensity in the edged image compared to Sobel (Fig. 8B). In Fig. 9C, one can once again see MLMVN performing image segmentation. MLMVN did a much better job detecting the edges in the noisy image, while Sobel in Fig. 9B did a terrible job, leaving the image nearly unusable. Fig. 10C shows a small side effect of MLMVN: comparing it to Fig. 10B, one can see that MLMVN smoothed out details in the building's outside wall, though it still did a great job with the edge detection. In Fig. 11B, Sobel performed edge detection primarily on the noisy pixels, whereas in Fig. 11C, MLMVN reduced the majority of the noise before performing edge detection, leaving a more usable image. Comparing Fig. 12B and Fig. 12C, one can see that Sobel could not differentiate between the main building and the people and objects
around the building. Sobel made all of the objects appear to be connected, as if they were one large object. MLMVN was able to differentiate between all of the different objects: the table umbrella, car, trees, humans, and sky were at a different intensity than the main building.
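The Sobel baseline used for comparison above can be sketched briefly. The following is a minimal NumPy illustration of the Sobel gradient magnitude, written for this summary; it is not the code used in the study, and MLMVN itself is not reproduced here:

```python
import numpy as np

def sobel_magnitude(img):
    """Gradient magnitude via the 3x3 Sobel kernels (edge-replicated borders)."""
    kx = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=float)
    ky = kx.T
    padded = np.pad(img.astype(float), 1, mode="edge")
    gx = np.zeros_like(img, dtype=float)
    gy = np.zeros_like(img, dtype=float)
    rows, cols = img.shape
    for r in range(rows):
        for c in range(cols):
            window = padded[r:r + 3, c:c + 3]
            gx[r, c] = np.sum(window * kx)   # horizontal gradient
            gy[r, c] = np.sum(window * ky)   # vertical gradient
    return np.hypot(gx, gy)

# A tiny test image with a vertical step edge; the response peaks at the edge.
step = np.zeros((5, 5))
step[:, 3:] = 255.0
edges = sobel_magnitude(step)
```

On a clean step edge the response is concentrated at the edge; on Gaussian-corrupted input the same kernels respond to the noise as well, which is the behavior discussed above.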
Conclusion
Our main conclusion from this work is that MLMVN is able to perform edge detection on both noisy and clean images. In both cases, MLMVN outperformed the Sobel operator. There was no case in which Sobel performed image segmentation, and Sobel also did not do well when it had to detect edges in a noisy image. MLMVN was able to perform edge detection along with image segmentation. There was no initial expectation that MLMVN would perform image segmentation; this was a bonus. The only downside to MLMVN was in edge detection on noisy images: while MLMVN did smooth out the edges in the images, it was still able to detect the edges more accurately than Sobel for noisy images. Sobel performed edge detection on pixels that were added artificially to the image. In the real world, where a computer may need to use edged images for tracking movements and depth perception, Sobel may confuse the computer, since it adds information into the images that is not really there. Thus, this experimental work was successful. The results for clean images came out better than expected, and the results for the noisy images were very good. To further improve MLMVN's ability to reduce noise while performing edge detection, the learning set needs to be expanded by including many more samples from more images.
Acknowledgments The author would like to thank the School of Science Summer Research Scholars Program for financial support and Dr. Igor Aizenberg for his mentorship.
References
[1] Aizenberg, I., Ordukhanov, A., and O'Boy, F., "MLMVN as an Intelligent Image Filter," Proceedings of the 2017 IEEE International Joint Conference on Neural Networks (IJCNN 2017), Anchorage, May 2017, pp. 3106-3113.
[2] Chowdhury, A. W., "Neural Networks in Business," Medium, 31 Aug. 2017, medium.com/oceanize-geeks/neural-networks-in-business-fd89d7afd490.
[3] Gonzalez, R. C. and Woods, R. E., "Digital Image Processing," 3rd Edn., Prentice Hall, 2008.
Reconstruction of atmospheric CO and the stable isotopes δ13C and δ18O over the last 8,000 years

Sophia Misiakiewicz∗
Department of Chemistry and Biochemistry, Manhattan College

Abstract. This study is an analysis of carbon monoxide (CO) concentrations and isotopic enrichments in ice cores from the South Pole, Antarctica. CO was isolated through the use of cryogenic vacuum extraction and a Finnigan MAT 253 isotope ratio mass spectrometer. The ice cores are dated based on the accumulation rates and seasonality of the South Pole location, going back over 7,000 years before present. Carbon monoxide concentrations were corrected for line background signals and instrument sensitivity, and then calculated from the sample volume and peak area. There is little variability in the calculated concentrations, with an average higher than modern-day values. An isotopic mass balance is used to apportion the sources of CO based on CO's chemical influences in the atmosphere. Among the major sources, there is a minimal influence of methane oxidation, and an inverse relationship is observed between biomass burning and non-methane hydrocarbon (NMHC) oxidation.
Introduction
Earth's atmosphere has varied over time, yet the climatic changes occurring today are proceeding at an extremely rapid rate. This deviation indicates a significant issue with the composition of the Earth's atmosphere. The composition and chemistry of Earth's atmosphere have direct implications for living organisms. The atmosphere not only provides protection from harmful ultraviolet radiation, but sustains life by providing important gases such as oxygen, which we breathe, and carbon dioxide, which plants use as a source for photosynthesis. As the composition of the atmosphere changes due to anthropogenic influences, such as fossil fuel use and deforestation, all living things will be affected. The study of current atmospheric conditions, coupled with the analysis of older atmospheric conditions, allows scientists to estimate future conditions and understand how these conditions could affect the planet. In order to predict the future effects of changing atmospheric composition on our planet, it is necessary to determine what could have caused it. Carbon monoxide (CO) has a lifetime of a few weeks to months. This lifetime is long enough to allow CO to travel hemispheric distances, yet short enough for it to still vary seasonally and interannually. CO is also a fundamental sink for the hydroxyl radical (OH), which is essential to the lifetimes of methane, hydrofluorocarbons, and tropospheric ozone. The hydroxyl radical varies based on the distribution of CO, CO2, NOx, and CH4 in the atmosphere (Mullaley, 2016). It also varies seasonally, since it is excited by UV radiation, which allows it to react. In Antarctica, it is dark for half of the year. In the summer, CO lifetimes are short due to its immediate reaction with OH and the presence of UV light. In the winter, CO is abundant because the annual solar radiation cycle produces fewer excited OH radicals (Mullaley, 2016).
These interactions with OH, which follows its own seasonal distribution, allow CO to differentiate based
Research mentored by Alicia Mullaley, Ph.D.
on season. Therefore, carbon monoxide can be used as a trace gas due to its spatial variability and its influence on atmospheric chemistry interactions. Once the concentration of CO is determined, we can use the isotopic enrichments to determine its sources. By establishing the sources of CO going back to 8,000 years before present, we will be better able to understand the CO cycle in the atmosphere and how it can vary over time. Carbon monoxide's major sources in the southern hemisphere include methane oxidation, biomass burning, and oxidation of non-methane hydrocarbons (NMHC). Biofuel burning and fossil fuel emissions are other major sources, more prominent in the Northern Hemisphere, where most of the planet's human population lives. Direct emission from the ocean is a minor source and is negligible.
Figure 1. The isotopic signatures for sources of CO. The significance of each source's contribution of CO to the atmosphere is indicated by circle size. The largest sources of hydrocarbon-derived CO are biomass burning and methane oxidation. Because the enrichments of the sources are so distinct, it is easier to differentiate their contributions to CO in the atmosphere (adapted from Khalil (1993)).
As Antarctic ice forms, it traps air bubbles, creating an archive of past atmospheric compositions. These gases can be extracted, and the concentrations of specific molecules can be analyzed. The site from which our ice cores come is the South Pole. Currently, the Antarctic ice sheet supplies ice cores for study and research. These meteoric ice cores are stored, curated, and studied primarily at the National Science Foundation Ice Core Facility in Colorado, U.S.A.
This research is an extension of Mullaley's (2016) atmospheric chemistry study. Through the use of a cryogenic vacuum extraction technique, the concentrations of carbon monoxide and the isotopic signatures of δ13C and δ18O in ice core bubbles were measured. The extraction is followed by continuous-flow isotope ratio mass spectrometry (cf-IRMS). We ask: how did the sources and sinks of the atmospheric trace gas CO vary over the last 8,000 years? This is addressed through an isotopic mass balance of δ13C and δ18O, based on the carbon monoxide concentrations extracted from South Pole ice cores. A subsequent question is: how do the variations in CO identified over the past 8,000 years compare to current-day CO levels? This is determined by comparing our calculated CO concentrations to previous studies of more recent carbon monoxide measurements.
Background and previous studies
Gases trapped in Antarctic ice are proxies for past compositions of the Earth's atmosphere. Records of the concentrations and theorized sources of greenhouse gases such as CO2, N2O, and CH4 go back thousands of years. Willi Dansgaard was the first person to develop the idea of ice as an archive. The Danish scientist discovered that heavy oxygen isotopes in Greenland ice correlated with the precipitation and temperature of the time they were trapped (University of Copenhagen, 2018). Dansgaard's ice study was the foundation for understanding climate variability. Ice cores can come from both the Northern and Southern Hemispheres; however, the sources of CO will differ based on human population and differences in climate. Ice core analysis is a specialized area of study within the scientific community. Mullaley (2016) processed the South Pole ice cores analyzed in this study in 2017 at the School of Marine and Atmospheric Sciences at Stony Brook University, NY. The data presented below are preliminary (Mullaley and Mak, 2018). The most prevalent research pertaining to South Pole ice core CO concentrations and isotopic enrichments takes the CO record back 650 years. That study, conducted in 2010, discovered that there were large variations in biomass burning in the Southern Hemisphere within the past 650 years. A significant decrease in [CO] was found between the mid-1300s and the 1600s (Wang et al., 2010). The isotopic analysis for the partitioning of CO reflected a decrease in CO derived from biomass burning (Wang et al., 2010). This study was significant because of its effective use of isotopic enrichments to determine sources. There are various other proxies for atmospheric composition. Where this study analyzes CO concentrations, there have been studies based on methane records. Methane is oxidized to CO, and its record is well documented; methane oxidation is the dominant source of CO in the southern hemisphere (Mullaley, 2016).
Due to methane's long lifetime, its concentrations are relatively stable and easy to determine (Ferretti et al., 2005). This record will be used to help determine the residual sources of CO.
Methods
Dating ice cores
Ice cores were provided from the South Pole in central Antarctica. Samples were drilled from a borehole, allowed to settle for one year, and then sent to the National Ice Core Laboratory in Denver, Colorado, where they were cut and analyzed (Mullaley, 2016). Dating the ice cores from the South Pole is a complex process. The dating is dependent on climatic conditions such as accumulation rate and temperature. The South Pole is a tundra, with low accumulation rates of 0.08 m/yr and average temperatures of −44 °C (Mullaley, 2016). Ice is formed by compression of the intermediate layer of snow, called the firn (Mullaley, 2016). Once the snow reaches a density of 0.8 to 0.83 g/cm3, ice begins to form at what is known as the close-off depth (Mullaley, 2016). The lower the accumulation, the less compression there is in the firn layer, causing the close-off depth to be deeper. The South Pole's tundra climate, with its low accumulation, results in a close-off depth of 120 meters (Mullaley, 2016). The intermediate firn layer allows for the intermixing of modern atmospheric gas with older gas through the spaces between snow particles. The exchange of gases continues until the air is trapped in the ice matrix at the close-off depth (Mullaley, 2016). The gas age reflects the diffusion of gases across the firn layer, making the gas younger than the ice in which it is trapped. This difference in age is called the delta age (Δage) value. Ice age is determined through a variety of methods that include the counting of annual snow layers. As depth increases, the layers become thinner. When the layers become too thin to identify visually, they are differentiated chemically by impurity content and composition (Mullaley, 2016). The scientific community working with ice core data for the South Pole uses the same ice age scale.

Figure 2. The different layers in which accumulation settles, to eventually form ice.
Based on the ice ages, we are able to calculate the gas age through the delta age. This calculation has an uncertainty of ±500 yrs. To determine the calendar age of the gas, we subtract the average gas age from 1950, the year that represents "present day" in the scientific community (Fig. 3).
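As a concrete illustration of the dating convention, the conversion from gas age to calendar year is simple arithmetic; the gas age used below is a hypothetical value, not one of the study's measurements:

```python
# "Present" is 1950 by convention in ice core science; a calendar year
# is obtained by subtracting the average gas age from 1950.
def gas_age_to_year(gas_age_bp, present=1950):
    return present - gas_age_bp

# A hypothetical sample with a gas age of 7,000 years;
# negative results are years BC (compare the negative years in Figs. 6-8).
year = gas_age_to_year(7000)
```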
Figure 3. Average ice age (yr BP 1950) and average gas age (yr) plotted against depth (m) for individual ice core samples. The delta age is approximately 1000 years for the South Pole, but can vary at different sites based on accumulation rates.
Cryogenic vacuum extraction procedure
Based on procedures by Wang et al. (2010), vacuum forces are applied to an extraction line connected to a mass spectrometer. Liquid nitrogen, at −200 °C, is applied to the line, cryogenically condensing out impurities within the gas sample such as water vapor, CO2, and N2O. Based on vapor pressures, CO continues down the line as the impurities are removed. In order for the mass spectrometer to determine a peak area for the isolated sample of CO, CO is converted to CO2. The sample retains its isotopic signatures even with this chemical oxidation. The oxidation is done by the Schütze reagent, which acts as a catalyst to oxidize CO to CO2. In order for the samples to be carried down the line, we introduce helium as a carrier gas; it is introduced at valve 11 (V11), which can be identified in Fig. 4. After the CO is oxidized to CO2, the CO2 is isolated (frozen) by liquid nitrogen in a collection trap. For a live sample, the volume collected is determined by waiting for a collection time of approximately 2.5 minutes. The end of collection is signaled by a drop in line pressure to approximately 90 to 70 mbar. At that point, the liquid nitrogen is replaced by ethanol cooled to −70 °C, so that the CO-derived CO2 continues down the line to the mass spectrometer while any additional water vapor remains in the trap. The volume of gas collected in the trap is recorded by the computer program LabVIEW. An important aspect of the line is the MKS mass flow controller, which controls the flow rate of the sample from areas of high pressure at the beginning of the line to areas of low pressure farther down the line. This pressure gradient allows the sample to continue down the line without any backflush.
Figure 4. Cryogenic vacuum extraction line. The image depicts the various instruments used for the isolation and measurement of the samples (Mullaley, 2016).
Diagnostic tests
In order to determine the background signals of the line and its sensitivity, we used several diagnostic tests. A system "blank" allows a quantification of CO2 breakthrough for traps 1 and 2, as well as diffusion across valves. This test bypasses the Schütze reagent in order to determine the efficiency of the line without CO oxidation to CO2; it quantifies only CO2 from breakthrough. The Schütze reagent, used to chemically oxidize CO to CO2 while preserving the original isotopic ratios, does have a background signal associated with impurities within the reagent and with diffusion of room air into the line. This signal needs to be quantified and then subtracted out of the ice sample. In order to quantify the Schütze blank signal, the line is opened to a flow of zero air, which contains no CO. A sample is collected for 2 to 3 minutes. Then the mass spectrometer processes the peak area and isotopic ratios.
In order to determine the sensitivity of the CO isotope analysis, a calibration gas of known concentration is run through the system. The calibration gas used has a CO concentration of 85.9 ppb. The nanoliters of CO processed are calculated by multiplying the calibration gas concentration by the volume processed (98.4 mL). This is then divided by the peak area of the calibration sample, already corrected by the "blank" diagnostic. The sensitivity used for the eventual calculation of the ice sample [CO] was 8.45 nL of CO per V·s of peak area. The sensitivity of the line can vary over the years, not necessarily from the way the system reacts, but because the response of the blank changes over the course of time (Fig. 5).
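The sensitivity arithmetic can be checked with the numbers quoted above (85.9 ppb calibration gas, 98.4 mL processed); the peak area below is a hypothetical stand-in for the blank-corrected value reported by the mass spectrometer:

```python
# nL of CO in the calibration sample: the concentration (as a mole
# fraction, 1 ppb = 1e-9) times the processed volume, converted mL -> nL.
cal_conc_ppb = 85.9
volume_ml = 98.4
nl_co = cal_conc_ppb * 1e-9 * volume_ml * 1e6   # ~8.45 nL of CO

peak_area_vs = 1.0                  # hypothetical blank-corrected peak area (V*s)
sensitivity = nl_co / peak_area_vs  # nL CO per V*s
```

With these numbers the calibration sample contains about 8.45 nL of CO, matching the sensitivity value quoted in the text.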
Figure 5. Sensitivity of the line (nL CO per V·s) over the course of time in which the samples were processed.
Bubble-free water
When an ice core is processed, it is melted down, and the escaping gases that had been trapped are pumped into the line. In order for these gases to be pushed into the cryogenic vacuum extraction line, bubble-free water is needed. We form pure, bubble-free water by boiling distilled water. Bubble-free water does not hold any dissolved gases. In order to increase the sample gas pressure for transfer, bubble-free water is pumped into the ice container. As the ice sample is melted down under a vacuum seal, the bubble-free water entering the container builds pressure, allowing the efficient transfer of sample gas into the line. It is important that all free gases are removed from the bubble-free water before it is introduced to the sample container; otherwise the sample gases will be compromised.
Ice core sampling
The ice core is prepared by scraping down its sides with a medical-grade scalpel. This makes the surfaces smooth, so that there is less likelihood of outside air getting lodged within cracks or divots. Diagnostic tests are then run on the line as explained in the sections above.
Schütze blanks are run until consistent (usually five blanks are needed). Calibration gas is run through twice, and then two more blanks are processed. After the system diagnostic tests are done, the ice core is introduced to the line. A vacuum seal is created in the ice container via a rotary vane pump and a turbo-molecular pump for five minutes. The ice sample is flushed with CO-free air five times, evacuated, and then flushed again. This removes any gases clinging to the surface of the ice. The ice is melted down, and the escaping gases are transferred to the line via the pressure created by the bubble-free water, since the ice container is under a vacuum seal. Once the ice is melted, the ice container is opened, allowing the gases to expand and flow into the line under the pressure applied by the bubble-free water. Collection of the sample is then conducted, and the sample volume and peak area are recorded.
Results and Discussion
Carbon monoxide concentrations
Following the processing of the CO from the South Pole ice cores, the CO concentrations are calculated by determining the nL of CO, corrected for the sensitivity and background signal of the line, and dividing by the volume of the sample:

Step (1): nL CO = (sensitivity of line) × (peak area of sample, V·s)

Step (2): [CO] (ppbv) = 1000 × (nL CO) / (sample volume, mL)
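Steps (1) and (2) can be combined into a single helper; the numeric inputs below are hypothetical illustration values, not measurements from the study:

```python
def co_concentration_ppbv(sensitivity_nl_per_vs, peak_area_vs, volume_ml):
    """Apply Step (1) then Step (2) from the text."""
    nl_co = sensitivity_nl_per_vs * peak_area_vs  # Step (1): nL of CO collected
    return 1000.0 * nl_co / volume_ml             # Step (2): ppbv

# Hypothetical sample: line sensitivity 8.45 nL CO per V*s,
# blank-corrected peak area 0.65 V*s, 100 mL of sample gas.
co_ppbv = co_concentration_ppbv(8.45, 0.65, 100.0)
```

The factor of 1000 converts nL of CO per mL of sample (a mole fraction of 1e-6, i.e. 1000 ppb) into ppbv.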
Figure 6. Various CO concentrations (ppbv, plotted against year AD) calculated from Dr. Mullaley and Prof. J. E. Mak's raw data.
The modern-day normal range for carbon monoxide levels is between 35 and 70 ppbv. The measured samples, dating back almost 8,000 years, have an average of 55 ppbv. Although this is higher than typical modern-day values, the data points demonstrate a level of consistency. The climate 8,000 years ago, with little human influence, would differ from what we identify as modern levels.
Comparison to present abundance of carbon monoxide
Carbon monoxide influences the interactions of molecules in the atmosphere. As a sink for the hydroxyl radical (OH), CO influences the lifetimes of greenhouse gases through its reactions with OH. By identifying the trends of CO concentrations as compared to measurements closer to the present day, we are able to distinguish variation in the CO cycle. Compared to other data sets, the CO concentrations from this study, shown in dark grey in Fig. 7, are consistent and at a higher average. This deviation allows us to hypothesize a differentiation in the CO cycle, and even a change in major sources.
Figure 7. CO concentrations (ppbv) calculated in this study, shown in dark grey, plotted against CO concentrations from previous studies: South Pole and D47 (Wang et al., 2010), Vostok (Haan and Raynaud, 1998), WAIS D (Mullaley, 2016), D47 (Haan et al., 1996), Assonov et al. (2007), and modern CO measured at Scott Base.
Determination of sources through isotopic enrichments
Isotopes are atoms of the same element with different numbers of neutrons, causing a difference in molecular mass. The isotopes addressed in this study are stable (not radioactive). Different isotopes have different sources or pools, such as the oceans, rocks, or the atmosphere. Isotopes of different masses have different properties, such as melting point and viscosity. Lighter isotopes, for instance, have weaker bonds, so less activation energy is needed for them to react. The first step in calculating the isotopic enrichment is correcting for gravitational fractionation, or fractional settling: isotopes within the gas samples settle at different rates due to mass differences. To understand the relationship between the sources of CO and the CO cycle, we have to quantify the major sources of CO according to their influences on concentration. The major sources of
CO in the southern hemisphere that were analyzed are methane oxidation, non-methane hydrocarbon oxidation, and biomass burning. These signatures have extreme isotopic ratios in comparison to each other, which makes it more effective to differentiate the sources and their contributions. In order to determine the contribution of each major source to the calculated CO concentrations, we use an isotopic mass balance: the concentration contributions of the sources, calculated from the isotopic enrichments, sum to the total concentration of CO. The stable isotopes analyzed in this study were δ13C and δ18O. The equations below illustrate the isotopic mass balance for δ13C. The calculations for δ18O are similar, except that an offset correction of 1.78, which comes from a 141 ppb CO Keeling plot model, is multiplied by 2; this multiplication is due to the additional oxygen atom in CO-derived CO2.

Overall equation (Mullaley, 2016): the contributions of CO from the individual sources add up to the total concentration of CO,

[CO_A] + [CO_B] + [CO_C] + [CO_D] + [CO_E] + [CO_F] + [CO_G] = [CO_T]

Mass isotopic balance equation (Mullaley, 2016):

Σ (i = A to G) δ13C_i × [CO_i] = δ13C_T × [CO_T]

where δ13C is the observed δ13C multiplied by the peak area of the sample:

(δ13C observed)(PA observed) = (δ13C Schütze)(PA Schütze) + (δ13C sample)(PA sample).

The methane record is very well known (Brook, 2009). Since methane oxidation is a source of CO, we are able to determine the methane contribution; this study used methane concentrations from Brook (2009). After calculating the CO concentration derived from methane oxidation, we are left with residual CO:

δ13C res = ([CO] × δ13C corr − δ13C CH4,ice × [CO CH4]) / [CO residual]

δ18O res = (δ18O CO × [CO]) / [CO residual].
We can then deduce the CO concentrations derived from biomass burning and NMHC oxidation, based on modeled numbers from 2005:

[δ18O res] × [CO res] = [δ18O NMHC] × [CO NMHC] + [δ18O biomass burning] × [CO biomass burning].

The δ18O values for NMHC and BB were found in Tables 4.1 and 4.2 of Mullaley (2016). The relative sources of CO are illustrated in Fig. 8. It is important to note that the methane contribution is minimal. I hypothesize that more modern CO concentrations will have a larger
methane contribution in proportion. This is because methane is produced by farming and agricultural practices. Since human influence was so minimal for these CO concentrations, it is predicted that the methane contribution is also minimal. The other two sources, biomass burning and NMHC oxidation, have an inverse relationship. This makes sense, since they are in proportion to one another.
Figure 8. Relative sources of CO (minimum): methane-derived CO (modeled), CO from biomass burning (minimum), and CO from NMHC oxidation (minimum). Shown are the ratios of the three major sources of CO in the southern hemisphere concentrated on in this study. The source contributions based on the residual CO calculations are plotted against the gas age.
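The two-source partition behind this kind of apportionment can be sketched as a small linear system. Given the residual CO (after removing the methane-derived part) and its δ18O, along with the source signatures for NMHC oxidation and biomass burning, the two mass-balance equations have a closed-form solution. All numbers below are hypothetical illustration values, not the study's results or the δ18O signatures from Mullaley (2016):

```python
def partition_residual(co_res, d18o_res, d18o_nmhc, d18o_bb):
    """Solve the two-source mass balance:
       [CO res] = [CO NMHC] + [CO BB]
       d18o_res*[CO res] = d18o_nmhc*[CO NMHC] + d18o_bb*[CO BB]
    """
    co_bb = co_res * (d18o_res - d18o_nmhc) / (d18o_bb - d18o_nmhc)
    co_nmhc = co_res - co_bb
    return co_nmhc, co_bb

# Hypothetical residual of 40 ppb with delta values of 0 (NMHC) and 17 (BB):
co_nmhc, co_bb = partition_residual(40.0, 8.0, 0.0, 17.0)
```

Because the two contributions must sum to the residual, an increase in one forces a decrease in the other, which is the inverse relationship noted above.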
Conclusions
Interactions of chemicals in the atmosphere have a direct influence on living things, and living things have a direct influence on the chemical interactions in the atmosphere; it is a cycle. To better understand how the atmosphere will develop under anthropogenic influence, it is important to understand the basic, natural cycles. As stated earlier, CO affects chemical interactions through its position as a sink for the hydroxyl radical. The hydroxyl radical directly influences the lifetimes of many greenhouse gases, whose concentrations are more relevant to climate change than ever before. Understanding the carbon monoxide cycle and how its sources influence its concentrations will be an important step in predicting climate change. The concentrations examined in this study can be interpreted as the CO cycle without human influences. Identifying the variation in concentration from 8,000 years ago to the present day gives us an idea of how influential humans can be on atmospheric composition. A future area in this niche of research is the isotopic analysis of more modern CO measurements. A comparison of the southern and northern hemispheres and their sources would also be interesting to look into. We know
that fossil fuels have a larger influence in the northern hemisphere, but there is a question of whether they are also prevalent in the southern hemisphere today. This research was exploratory. Calculating and analyzing CO concentrations and sources is a complex process that takes many variables into account. As scientists, any information discovered is significant, and achieving the lowest degree of uncertainty in the data is a priority. Analyzing gases from so long ago opens up opportunities for different analyses of anthropogenic influences, not only in atmospheric chemistry, but also in how atmospheric chemistry has implications for climate and how it can dictate future events.
Acknowledgment This work was supported by the School of Science Summer Research Scholars Program. The experiments were done at Stony Brook University. The author is indebted to Dr. Alicia Mullaley and Dr. John E. Mak for the guidance throughout this work.
References
Ferretti, D. F., Miller, J. B., White, J. W. C., Etheridge, D. M., Lassey, K. R., Lowe, D. C., MacFarling Meure, C. M., Dreier, M. F., Trudinger, C. M., van Ommen, T. D., and Langenfelds, R. L. Unexpected changes to the global methane budget over the past 2000 years. Science, 309. doi:10.1126/science.1115193 (2005).
Khalil, M. A. K. (Ed.), Atmospheric Methane: Sources, Sinks, and Role in Global Change, Springer-Verlag, Berlin (1993).
Mullaley, A. R. Reconstruction of Atmospheric [CO] and Stable CO Isotopes δ13C and δ18O Over the Last 250 Years (doctoral dissertation). Stony Brook University (2016).
Mullaley, A. R. and Mak, J. E., personal communication (2018).
University of Copenhagen. The history of Danish ice core science. Retrieved from http://www.iceandclimate.nbi.ku.dk/about centre/history/ (2018).
Wang, Z., Chappellaz, J., Park, K., and Mak, J. Large variations in Southern Hemisphere biomass burning during the last 650 years. Science, 330, 1663-1666. doi:10.1126/science.1197257 (2010).
Analyzing the concentration of atmospheric CO derived from biomass burning during 1700-1800 AD

Peter Parlato∗
Department of Chemistry and Biochemistry, Manhattan College

Abstract. Carbon monoxide (CO) is a trace gas that influences atmospheric chemistry interactions. Reconstructing variations in biomass burning is important because biomass burning plays a key role within the earth-atmosphere system. The data presented in this research give a better understanding of the past trend in biomass-burning-derived CO. These data can also be used in models as additional constraints.
Introduction
This research project analyzed the gas trapped in bubbles from ice cores drilled from the West Antarctic Ice Sheet Divide (WAIS Divide), Antarctica, to observe the effects of biomass burning on the atmosphere from 1700-1800 AD. Biomass burning is the combustion of living matter; it can be naturally occurring or the result of human activity. This project was mainly concerned with the emission of carbon monoxide (CO) from biomass burning. CO is significant because it can alter the oxidation state of the atmosphere by changing the concentration of the most important oxidant, the hydroxyl radical (OH). More specifically, CO in the atmosphere reacts with OH, thus lowering its concentration. This allows other important trace gases, like the major greenhouse gas methane, to build up in the atmosphere, because less OH is available to react them away. Greenhouse gases can cause significant variations in the climate of the earth. For example, a warmer climate has the potential to change water supplies, alter the ability to grow crops, and cause a rise in sea level. In addition, reconstructing variations in biomass burning, a major source of CO, is of interest because biomass burning plays a key role within the earth-atmosphere system. Climate impacts the biomass burning pattern, as suggested by the potential correlation between burning and climatic variations such as temperature and precipitation trends. Over time, the snow in Antarctica compresses to create layers of ice, which trap gases from the atmosphere in the form of air bubbles. This means that ice cores are a physical record of the earth's atmospheric chemical makeup through time. By analyzing the gas trapped in an ice core sample, more about the past can be understood; by better understanding the past, the future of earth's climate can be better predicted. The stable isotopic signatures of CO help to determine its source (Fig. 1).
Since the timeframe being analyzed is prior to the industrial revolution, the three main sources of CO are methane oxidation, non-methane hydrocarbon oxidation, and biomass burning. Fossil fuel burning is not yet a factor in the southern hemisphere during this time and the oceans are a very small source,
∗ Research mentored by Alicia Mullaley, Ph.D.
The Manhattan Scientist, Series B, Volume 5 (2018)
so they are not factored in. The methane derived CO will be determined first to find the biomass burning derived CO.
Figure 1. Isotopic signatures and sources of CO [1]
Objectives
The objective of this project was to analyze ice core samples from WAIS divide, which cover ∼1700-1950 AD. Only a few years within this time frame were analyzed, since the procedure for this work takes a significant amount of time. What was investigated during this period was how the CO concentration from biomass burning changed in the southern hemisphere. I hypothesized that the concentration of atmospheric CO increased from 1700−1950, given the increased presence of human activity. I also hypothesized that there was a large increase at the start of the industrial age, ∼1800 AD. The next objective was to see if there is a correlation between the concentrations of CO and methane. I hypothesized that there is a direct correlation between CO and methane: as CO increases, so does methane. If the outcome is that methane is of significantly higher concentration than the ratio would suggest, that could mean that another source of methane is contributing to the result, requiring further study.
Analytical Methods
The samples were analyzed with cryogenic vacuum extraction and mass spectrometry. A custom-built cryogenic vacuum extraction line was used to extract the CO from the ice core according to Mullaley [1]. The principles are discussed in detail in [2], but essentially, the extraction
Figure 2. The interrelationship among CO, CH4, and OH. The mean abundance of the OH radical is largely determined by the abundances of CO and CH4, because these two gases are the largest sinks for OH [1]
line has a water trap and two additional clean-up traps added to remove water vapor, N2O, and CO2, which would interfere with the accuracy of the results. In this procedure, the ice was melted so that the gases trapped inside could be released. Once the H2O, N2O, and CO2 were removed from the sample gas, the CO was oxidized to CO2 by I2O5 on a silica gel support, so that it could be analyzed by mass spectrometry. After the CO-derived CO2 was collected, it was loaded into a gas chromatography column to separate the CO2 from any possible trace amounts of N2O. Directly from the GC column the gas entered a mass spectrometer; for this project, a Finnigan MAT 253 isotope ratio mass spectrometer was used. The peak area of a reference gas of known concentration, provided by NOAA, was then compared to the peak area from the experiment, and from this the concentration of CO in the sample was determined. This process was then repeated on different depths of the ice core. The deepest part of the core has the sample dating farthest back (∼1700 AD) and the top of the core has the most recent sample (∼1800 AD).
Procedure
An updated and improved version of a cryogenic extraction system (Fig. 3) previously used in similar studies was utilized to extract CO. In addition, gas chromatography and mass spectrometry were used in the procedure. Before any ice can be processed, however, a number of diagnostic experiments must be performed on the system to ensure that accuracy and precision are acceptable when running the actual experiment.

Bypass blank
The bypass blank procedure is used to show how efficiently the liquid nitrogen clean-up traps are working and to make sure there are no leaks in the line. Zero air, i.e. air that is free of any hydrocarbons, is run through the system during a bypass blank. The Schütze reagent reactor is open when doing the bypass blank as well, to make sure CO is being oxidized to CO2.
Figure 3. Schematic of the system used for CO concentration, δ13C and δ18O analysis of ice cores [1].
Schütze blank
The Schütze reagent quantitatively converts CO to CO2 at room temperature. The Schütze reagent also preserves the original oxygen isotopic ratio of the CO, which is very important to the analysis. There is a background signal associated with the Schütze reagent that is likely due to impurities or diffusion of air across the O-rings located in the Schütze reagent vessel [3]. To quantify the Schütze signal, CO-free air is processed through the extraction line. The CO value for the Schütze blank is usually ∼5 ppb. This value needs to be subtracted from the sample reading for better accuracy. Schütze blanks were run before each sample was processed. Once the line was flushed through the bypass, the pump valve was closed and so was the bypass. The gas then flowed through the Schütze reactor. Next, Isodat 2.0, the program used to control the pre-concentration unit, was started. Once the downline pressure is above 1100 mb, valve
17 is closed and valve 14 is opened. Valve 14 is used for the sample and allows gas to go to the pre-concentration unit and ultimately the mass spectrometer. When the gas reaches the final trap, it first goes to a gas chromatograph and then to the mass spectrometer. There are three traps for the gas; once trap 3 (T3) is raised, the gas is able to travel to the mass spectrometer and measurements can commence. At this time, Isodat 3.0, the program that controls the MAT 253, is started. This program, which takes the measurements, was also used for the calibration gas and the actual sample.

Calibration gas
Calibration gas is used to confirm the precision of the CO isotope system. The calibration gas is extremely pure CO of a known concentration and isotopic composition. Calibration gas runs determine the accuracy and reproducibility of the system. It is important that the results of the isotope ratio of the calibration gas are consistent: high precision in this test means that the system is running well, which is important for the accuracy of the results during the processing of the sample. Calibration gas is processed according to the procedure described above for Schütze blanks.

Samples
The ice core samples from Antarctica come from the National Ice Core Laboratory in Colorado, which cuts and distributes ice cores across the country for research. Ice core samples are first cut in half with a band saw and scraped with a medical-grade scalpel to clean and smooth the surface. After the ice is trimmed, it is weighed and the mass is recorded. The ice, now ready to undergo extraction, is placed in a specially made glass container and attached to the line. The ice and container are then evacuated and flushed with zero air three times before melting. At this point the system is prepared to run, but one additional item is needed for processing the ice sample: high-purity, bubble-free water.
The purpose of this is to fill the residual volume in the ice container, which increases the gas sample pressure for a more efficient transfer into the extraction line. To make bubble-free water, a specialized glass container was used. The container was filled with deionized water, covered with a lid fitted with a valve, and placed on a hot plate. Boiling the water on the hot plate allows the gases trapped in the water to escape; the lid prevents water from boiling off and prevents any room air from coming into contact with the water. Once the bubble-free water was prepared, the container was attached to the sample container. The sample container is fitted with an O-ring and a horseshoe clamp and placed in a liquid nitrogen-alcohol bath held at −40 °C on a jack stand. The procedure for the sample is now able to start; it is similar to the procedure for the Schütze blank, but there are key differences. First, valve 4, valve 5, and the zero-air generator valve are closed. Next, the pump evacuates the upline of the system. Once the line is evacuated and the sample is secure, the sample is evacuated using the rough and turbo pumps. Valve 2 is opened, valve 5 is closed, and the regulator
is also closed. After 5 minutes, the turbo is closed and the pump valve, valve 3, is closed. The sample now needs to be flushed with CO-free air five times. First, valve 3 for the rough pump is opened to flush and then closed again. Next, the zero-air valve is opened until the pressure reaches 800 mbar and then closed again. After this process is repeated five times, the turbo pump is opened for one minute. Lastly, valve 1 and valve 3 are closed and the pump valve is put back to neutral. The sample now needs to be evacuated again and flushed three times with a working gas standard (calibration gas). The ice container was filled to ∼1050 hPa with calibration gas and valve 2 was closed. A calibration gas test needs to be run before extraction for the purpose of accuracy; the previously stated collection procedure for calibration gas was carried out during this time. After the calibration-gas-on-ice test was done, the ice container was evacuated and flushed with zero air three times. Next, the container was evacuated with the turbo pump for 20 minutes, the ice container was closed, and valves 1 and 3 were closed. After that, a system blank is done with zero air to flush the line. Once the line has been evacuated, it is ready to process the sample. First, valve 5 and the zero-air valve were closed. Valve 3 was then opened and the rough and turbo pumps were turned on. Next, valve 5 was opened and the pump valve was closed. The ice container is now evacuated and immersed in a hot water bath to melt the ice and release the gas trapped in its bubbles. During this melting period, two Schütze blanks were run before opening the valve to the ice container. Once the ice had melted, the container valve was finally opened, allowing the gas to enter the upline. As the gas flows from upline to downline, any condensable gas species, e.g. water vapor, are frozen out in the coil traps, which are kept cold by pure liquid nitrogen in metal dewars.
There are four traps on the line, excluding the two microfocus cryogenic traps.
Data Analysis
Concentration analysis
Finding the total concentration of the CO in the sample was the first part of the analysis. The total concentration of CO is very small and is measured in parts per billion by volume (ppbv). First, a system sensitivity was calculated using data from the calibration gas. Sensitivity is essentially the response of the instrument and is calculated by dividing nanoliters of CO by the blank-corrected peak area, in Volt-seconds (Vs). Next, the average of the peak areas from the Schütze blanks is taken and subtracted from the peak area of the sample. This value is multiplied by the sensitivity, which gives the nanoliters (nL) of CO. This value is then used to find the concentration in ppbv by multiplying by 1000 and dividing by the volume in mL:

[CO] (ppbv) = 1000 × (nL of CO) / volume (mL)
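As a sketch, the blank correction and unit conversion described above can be written out in code. The function and the numbers in the usage note are illustrative only, not measured values from this study:

```python
def co_concentration_ppbv(sample_peak_area_vs, schutze_peak_areas_vs,
                          sensitivity_nl_per_vs, sample_volume_ml):
    """Blank-corrected CO concentration in ppbv.

    sample_peak_area_vs: mass spectrometer peak area of the sample (Vs)
    schutze_peak_areas_vs: peak areas of the Schutze blanks run beforehand (Vs)
    sensitivity_nl_per_vs: system sensitivity from the calibration gas (nL/Vs)
    sample_volume_ml: volume of gas extracted from the ice (mL)
    """
    # Average the Schutze blanks and subtract them from the sample peak area
    blank = sum(schutze_peak_areas_vs) / len(schutze_peak_areas_vs)
    corrected_area = sample_peak_area_vs - blank
    # Sensitivity converts peak area (Vs) to nanoliters of CO
    nl_co = corrected_area * sensitivity_nl_per_vs
    # [CO] (ppbv) = 1000 * nL of CO / volume (mL)
    return 1000.0 * nl_co / sample_volume_ml
```

For example, with a hypothetical sample peak area of 10 Vs, two blanks of 2 Vs each, a sensitivity of 0.5 nL/Vs, and 85 mL of gas, the function returns about 47 ppbv.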
CO Concentration
Measured peak area, volume, and sensitivity were used to calculate [CO]. Compared to the total volume of gas, the volume of CO is very small, measured in nL. The CO concentration data during this time were mostly consistent, with an average of ∼47 ppbv (±5 ppbv).

Figure 4. CO concentration (ppbv) vs. year (gas age, AD). Data from A. R. Mullaley, personal communication (2018).
CH4 derived CO
The [CO] derived from CH4 was determined from the past CH4 concentration data coupled with modeled data of modern-day CH4-derived CO. It is expected that modern-day methane concentrations would have a larger contribution to CO. From these data, methane-derived CO is shown to be a steady source (∼10 ppbv).

Figure 5. CH4 derived CO (full circles) and CO concentration (hollow circles) vs. year (gas age, AD).
Isotope analysis
To find what percentage of the total concentration of CO came from biomass burning, an isotope mass balance was used to quantify the different source partitioning [4, 5]. The following
equations were used:

[CO_A] + [CO_B] + [CO_C] = [CO_T]
δ18O_A [CO_A] + δ18O_B [CO_B] + δ18O_C [CO_C] = δ18O_T [CO_T]
δ18O_r [CO_r] = δ18O_NMHC [CO_NMHC] + δ18O_BB [CO_BB],

where A, B and C represent the three major CO sources in the Southern Hemisphere: biomass burning, methane oxidation, and non-methane hydrocarbon (NMHC) oxidation. These sources represent ∼95% of the overall sources of CO in pre-industrial times. T designates the total CO, i.e. the total concentration of CO found in the ice core samples (Figs. 4 and 5). The ratio of these sources can be found with the isotope data. The δ18O data are used for the isotope mass balance model, since the δ13C signatures have a high uncertainty: for biomass burning, δ13C depends on the ratio of burned C3 to C4 plants, and studies report that this varies over time [3]. The methane-derived CO can be calculated from the abundance of methane with the equation

[CO]_CH4 = ([CO]_CH4, present day / [CH4]_present day) × [CH4]_historic.
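A minimal version of this scaling in code, using illustrative (not actual) concentration values in the usage note:

```python
def ch4_derived_co_ppbv(historic_ch4_ppbv, present_ch4_ppbv,
                        present_ch4_derived_co_ppbv):
    """Scale the modeled modern-day CH4-derived CO by the historic/present CH4 ratio.

    Implements: [CO]_CH4 = ([CO]_CH4,present / [CH4]_present) * [CH4]_historic
    """
    return present_ch4_derived_co_ppbv / present_ch4_ppbv * historic_ch4_ppbv
```

For example, with a hypothetical present-day CH4-derived CO of 25 ppbv, a present-day CH4 of 1800 ppbv, and a pre-industrial CH4 of 720 ppbv, this gives 10 ppbv, of the same order as the steady ∼10 ppbv source seen in Fig. 5.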
Once the methane-derived CO is calculated, δ18O is assigned (Fig. 1) and the system is solved to find the biomass burning derived CO as well as the NMHC derived CO. Fig. 6 is a graph of the residual δ13C and δ18O needed to find the sources. The residual has the methane contribution removed, since that contribution is known from the robust concentration and isotope record available today.

Figure 6. Residual δ13C (open circles) and δ18O (full circles) after correction, vs. year (AD).

Biomass burning (BB) derived CO
To determine the amount of CO that came from biomass burning, the [CO]_NMHC was determined from δ18O isotope values and then subtracted from the residual [CO]. The trend for biomass burning derived CO does not have any significant structure, but the data do indicate some years where [CO]_BB seems to be a more significant source. There is a higher uncertainty associated with this data set due to the δ13C signatures.
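The two-source partitioning of the residual can be sketched as the solution of the last mass-balance equation together with the residual concentration balance. The δ18O end-member values in the usage note are placeholders, not the signatures actually assigned from Fig. 1:

```python
def partition_residual_co(co_r_ppbv, d18o_r, d18o_bb, d18o_nmhc):
    """Split the residual CO into biomass-burning and NMHC parts.

    Solves the two-equation system:
        co_bb + co_nmhc = co_r
        d18o_bb*co_bb + d18o_nmhc*co_nmhc = d18o_r*co_r
    """
    co_bb = co_r_ppbv * (d18o_r - d18o_nmhc) / (d18o_bb - d18o_nmhc)
    co_nmhc = co_r_ppbv - co_bb
    return co_bb, co_nmhc
```

For example, a hypothetical residual of 37 ppbv with δ18O_r = 10‰, δ18O_BB = 17‰, and δ18O_NMHC = 0‰ splits into roughly 21.8 ppbv from biomass burning and 15.2 ppbv from NMHC oxidation.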
Figure 7. Concentration of biomass burning derived CO (ppbv) vs. year (gas age, AD).
Conclusions
The ice analyzed in this project dates back to a gas age prior to 1800. The CO concentration data during this time seem mostly consistent, with an average of ∼47 ppbv (±5 ppbv). The trend for biomass burning derived CO does not have any significant structure, but the data do indicate some years where [CO]_BB seems to be a more significant source. From these data, methane-derived CO is shown to be a steady source (∼10 ppbv); however, it is expected that modern-day methane concentrations would have a larger contribution to CO. These data are important because they have many uses in helping us further understand the history of Earth's atmosphere. One important application is that the CO concentration and isotope results can be used as constraints in climate-chemistry models that attempt to simulate an atmosphere significantly different from today's.
Acknowledgments
This work was supported by the School of Science Summer Research Scholars Program. The author would like to thank Dr. Alicia Mullaley, Manhattan College School of Science, and Professor J. E. Mak, School of Marine and Atmospheric Science, Stony Brook University, for their mentorship.
References
[1] Mullaley, A. R., Reconstruction of Atmospheric [CO] and Stable CO Isotopes δ13C and δ18O Over the Last 250 Years (doctoral dissertation). Stony Brook University, 2016.
[2] Brenninkmeijer, C. A. M., Measurement of the abundance of 14CO in the atmosphere and the 13C/12C and 18O/16O ratio of atmospheric CO with applications in New Zealand and Antarctica, Journal of Geophysical Research-Atmospheres, 98(D6), 10595-10614, 1993.
[3] Huang, Y., F. A. Street-Perrott, S. E. Metcalfe, M. Brenner, M. Moreland, and K. H. Freeman, Climate change as the dominant control on glacial-interglacial variations in C3 and C4 plant abundance, Science, 293, 1647, 2001.
[4] Mak, J. E. and Kra, G., The isotopic composition of carbon monoxide at Montauk Point, Long Island, Chemosphere - Global Change Science, 1, 205-218, 1999.
[5] Wang, Z., Chappellaz, J., Park, K. H., and Mak, J. E., Large variations in Southern Hemisphere biomass burning during the last 650 years, Science, 330, 1663-1666, 2010.
Observable relics of a simple harmonic universe
Peter Gilmartin∗
Department of Physics, Manhattan College
Abstract. We take models of the Simple Harmonic Universe and its possible relation to our universe and investigate what observable signals may be detected from it through the use of the software packages CLASS and Monte Python. The components of this Simple Harmonic Universe are positive curvature, a negative cosmological constant, and one or more exotic matter sources with an equation of state w = p/ρ between −1 and −1/3. We simulate the effects of these signals on the Cosmic Microwave Background. We find that any effects manifesting from curvature dominate the changes to the CMB, while effects from the other signals are subdominant to those from curvature. We also find that small amounts of positive curvature that are unmeasurable in the late universe can still be detected through their effects on the primordial power spectrum. These effects offer an explanation for the quadrupole anomaly.
Introduction
The Simple Harmonic Universe (SHU) is a class of models studied in refs. [1] and [2]. These models were created to investigate the possibility of having a universe that could 'bounce,' supported by solutions of general relativity that avoid singularities. The Simple Harmonic Universe created from this project requires various new energy sectors. These include positive curvature, which manifests as negative energy, and a negative cosmological constant; these would be responsible for contracting the universe. There is also the addition of matter with an equation of state w = p/ρ between −1 and −1/3. This matter can be manifested, for example, as a sort of honeycomb structure permeating space or as a cobweb of strings, depending on the value of w. All types of matter and energy source an expansion or contraction as governed by the Friedmann equation. Combining this equation with Euler's fluid equations allows us to describe how these matter-energy sources dilute as the universe expands. This not only allows us to determine whether the universe as a whole is expanding or contracting, it also allows us to determine how the universe evolves over time:

(ȧ/a)² = H² = (8πG_N/3) (ρ₀/a^{3(1+w)} − Λ) + K_eff/a²

The first term, ȧ/a, is the Hubble parameter, H, which describes the rate of expansion of the universe. The terms on the other side of the equation describe various matter sources in the universe. The last term, K_eff/a², represents curvature in the universe together with a string-like exotic matter with w = −1/3. The terms within the parentheses are more standard matter sources: Λ represents dark energy, while the first term represents any other type of matter. The most important variable in that term is the w in the exponent of the denominator. This is the equation of state, which determines how a matter source dilutes. For example, Λ has w = −1, so it does not dilute at all.
∗ Research mentored by Bart Horn, Ph.D.
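As an illustration of how the Friedmann equation above is evaluated, the following sketch computes H² for a single fluid plus the curvature term, in natural units with G_N = 1; all parameter values in the usage note are arbitrary, not fitted:

```python
import math

def hubble_squared(a, rho0, w, cosmological_constant, k_eff, g_n=1.0):
    """H^2 = (8*pi*G_N/3) * (rho0 / a**(3*(1+w)) - Lambda) + K_eff / a**2"""
    # The dilution of the fluid is set by its equation of state w
    matter = rho0 / a ** (3.0 * (1.0 + w))
    return (8.0 * math.pi * g_n / 3.0) * (matter - cosmological_constant) \
        + k_eff / a ** 2
```

For w = −1 the matter term does not dilute at all, so hubble_squared(1.0, 1.0, -1.0, 0.0, 0.0) equals hubble_squared(2.0, 1.0, -1.0, 0.0, 0.0), while the curvature term falls off as 1/a².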
There are two possible ways a SHU could be related to our current universe. One possibility is the occurrence of a false vacuum decay [3], resulting in the matter sectors of the SHU reorganizing into the sectors we see in our universe. In this case, most of the exotic matter sources would be transformed into energy and dumped into the cosmological constant, canceling out the negative energy of positive curvature. This would possibly leave behind remnants of these exotic sources as relics. The second possibility is that our universe is contained within the evolution of the SHU: our current epoch of expansion would merely be one phase of the SHU, where the positive curvature has been diluted to be unnoticeable in the present day. Either scenario would leave some form of observable relic, which may include positive curvature, either with or without the extra matter sources. We observe the effects of these relics by investigating the Cosmic Microwave Background (CMB) radiation. The CMB (Fig. 1) is a resource that can be drawn upon to investigate the origins of the universe. Whenever we look into the sky we are looking into the past, a result of the finite speed of light. With precise telescopes, we can view the history of the cosmos; this allows us to observe the conditions of the early stages of the universe's evolution.
Figure 1. Cosmic Microwave Background (CMB) radiation map from Planck Image Gallery [4]
The early universe was too energetic to allow neutral atoms to form, so all of existence was a soup of particles and plasma. These blocked all photons, so the universe was opaque. Time passed, and eventually the universe cooled down enough to allow particles to bond and form atoms. Helium formed first, followed shortly by hydrogen. This moment is called recombination. At this moment the universe became transparent, meaning that the CMB is a snapshot of the universe at recombination. The hot and cold spots, modeled in red and blue, are theorized to have been generated by quantum fluctuations during the era of primordial inflation. There have been specialized
missions, such as the Planck space telescope, to record and map the distribution of this radiation, leading us into a new era of precision cosmology. How the temperature fluctuations in the CMB are distributed tells us about the expansion history of the universe. These data are graphed as a power spectrum of the relative intensity of multipole moments. This spectrum holds a great deal of information. The primordial contribution to the spectrum is generated by inflation in the very early universe, which also sets the initial conditions for the acoustic peaks seen in the CMB (Fig. 2). These peaks evolve from their initial conditions as the corresponding modes reenter the observer horizon, with the evolution guided by when they reentered. The higher multipoles came back in first; this is why they decrease in intensity: they reentered the horizon early and the spectrum was degraded by photon diffusion. The lower multipoles of the spectrum were originally generated during the primordial era and were quickly thrown beyond the observer horizon. They were also the last to come back within the horizon.
The acoustic peaks we see are from sound propagation in the early universe. These peaks are correlated with matter-energy sources. Curvature and dark energy affect the size and orientation of the first peak, dark matter does the same for the second peak, and baryonic matter for the third peak [5]. It is these peaks that we are interested in. They give us information about the matter-energy composition and evolution of the universe.

Figure 2. Standard CMB power spectrum (intensity in μK² vs. multipole moment).
Methods
The acoustic peaks of the CMB power spectrum can be simulated by the Cosmic Linear Anisotropy Solving System (CLASS) software [6]. Another option was the CAMB software, but CLASS operates in C++ and Python, which makes the software much easier to understand and modify. CAMB runs faster because it is built with Fortran, but that also makes it more difficult to modify. CLASS is also compatible with the MontePython [7, 8] program, which allows us to do further quantitative estimates of parameter likelihoods by running Monte Carlo simulations.
CLASS is able to generate new power spectra based on the data it is given, which can then be compared to the observed power spectrum. This allows us to test how new types of matter and energy would affect the CMB peaks. Fig. 2 shows what CLASS outputs without any changes to the currently accepted values of the cosmological parameters; this is the control of our investigation. We first started by making simple changes to the parameters to gain an understanding of how the code operated and to build up our intuition of how these various matter-energy sources affect the universe. We ran a series of curvature variations (Fig. 3), where Ωk represents the percentage of the energy density today that consists of curvature. This was of interest because little research has been done into positive curvature; most studies have focused on negative curvature or flat geometry.
Figure 3. Power spectra for various curvature values
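Runs like those in Fig. 3 are driven by a CLASS parameter file. A minimal example might look like the following; parameter names follow the conventions of CLASS's explanatory.ini, and the specific Omega_k value is illustrative (in this convention a closed universe with positive spatial curvature corresponds to negative Omega_k):

```
# request temperature and polarization spectra, with lensing
output = tCl,pCl,lCl
lensing = yes
# percent-level spatial curvature (sign convention: negative = closed)
Omega_k = -0.005
```

Each such file produces one curve, and sweeping Omega_k over a set of values generates a family of spectra like the one shown above.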
The series of curves in Fig. 3 shows that changing the curvature does not change the intensities of the peaks but, instead, shifts the location of the peaks due to the large-scale lensing effects in the late-time universe. We also used CLASS to model how additions of new energy sectors would affect the CMB. We found that adding both positive curvature and a new matter source with w = −1/3 to balance out the total energetic contributions resulted in very little change in the TT spectrum, relative to the effects of curvature alone. There is a more noticeable effect on the polarization spectra (Fig. 4), which may be due to their being more sensitive to changes in the evolutionary history of the universe. We also added in a new matter source. The CLASS software had a built-in module for adding new matter sources, so we used it to create our own exotic matter. We also modified the code to allow for multiple new fluid sources instead of just one. This also allowed us to modify the equation of state for the dark energy sector away from −1, as shown in Fig. 5. We found that this tends to make the constraints on curvature slightly stronger. We have not yet done a comprehensive examination of the effects of multiple exotic matter sectors. We modified the primordial section
Figure 4. TT, EE, and TE power spectra (intensity in μK² vs. multipole moment) with positive curvature alone and with curvature plus string matter.
Figure 5. Power spectrum with variations to the equation of state for dark energy (w = −0.80, −0.85, −0.90, −0.95).
of the code to mimic a small amount of positive curvature in the early universe. Fig. 6 shows the first fifty multipoles of the TT spectrum, which are generated at the earliest epoch of inflation we can observe. The original equation for calculating this spectrum is shown below, along with the modified version:

P(k) = A_s (k/k_1)^{n_s − 1 + (1/2) a_s ln(k/k_1)}  →  P(k) = A_s (k/k_1)^{n_s − 1} · k²/(k² + k_0²)

Figure 6. Original and modified primordial spectrum [10]: Planck satellite data, the best-fit prediction for the standard cosmological model, and the modified spectrum with k_0 set to 0.000194.
We had to rename a_s to k_0 in the second equation for technical reasons.
As stated earlier, positive curvature in the universe manifests as negative energy. Negative energy from this curvature suppresses the generation of fluctuations in the early universe (see [9] for related work on similar effects), and it is these fluctuations that generate the primordial spectrum. The addition of this positive curvature should therefore suppress the primordial spectrum, and that is what we see in Fig. 6.
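A toy version of the modified primordial spectrum can make the suppression concrete; A_s, n_s, and the pivot k_1 below are placeholders, with k_0 the value quoted in Fig. 6:

```python
def modified_primordial_spectrum(k, a_s, n_s, k1, k0):
    """P(k) = A_s * (k/k1)**(n_s - 1) * k**2 / (k**2 + k0**2)

    The last factor suppresses power for wavenumbers below ~k0,
    mimicking positive curvature at the earliest observable epoch
    of inflation, while leaving larger k essentially untouched.
    """
    return a_s * (k / k1) ** (n_s - 1.0) * k ** 2 / (k ** 2 + k0 ** 2)
```

At k = k_0 the suppression factor is exactly 1/2, and for k much larger than k_0 it approaches 1, which is why only the lowest multipoles are affected.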
Results
The largest experimental constraints on the Simple Harmonic Universe come from the positive curvature component. The Planck satellite data and Monte Python constrain the level of positive curvature to a maximum of half a percent of the total energy of the universe in the present epoch. The effects of the exotic matter do not alter this conclusion, as the exotic matter either generates minute changes to the spectrum compared with curvature or serves to strengthen the constraints on the value of curvature. However, as shown in Fig. 6, even if the curvature is too small to detect now, we can observe its effect on the primordial spectrum during inflation. We found that the addition of positive curvature in the early universe clearly affected the primordial spectrum, while leaving the later multipoles relatively untouched. The suppression of the early multipoles allows us to propose an explanation for the quadrupole anomaly. We also ran these modifications through a Monte Carlo simulation to determine the likelihood parameters for the positive curvature. The likelihood graphs (Fig. 7) show that positive curvature is allowed by the Planck data, and in fact suggest that a small amount of curvature could be preferred. However, the error bars for the curvature allow for zero curvature. Our solution to the quadrupole anomaly has only a 1-2 σ confidence level, which will be difficult to improve: the precision of the Planck data means that the bulk of the uncertainty comes from cosmic variance, which is by nature extremely difficult to diminish.
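The parameter-likelihood estimation that MontePython performs can be illustrated with a toy Metropolis sampler on a one-dimensional Gaussian likelihood; this stands in for the real pipeline, which evaluates the Planck likelihood at each proposed set of cosmological parameters:

```python
import math
import random

def metropolis(log_likelihood, x0, step, n_steps, seed=0):
    """Minimal Metropolis MCMC: random-walk proposals, accepted with
    probability min(1, L_new / L_old)."""
    rng = random.Random(seed)
    x = x0
    log_l = log_likelihood(x)
    chain = []
    for _ in range(n_steps):
        proposal = x + rng.gauss(0.0, step)
        log_l_prop = log_likelihood(proposal)
        # always accept uphill moves; accept downhill with probability ratio
        if log_l_prop >= log_l or rng.random() < math.exp(log_l_prop - log_l):
            x, log_l = proposal, log_l_prop
        chain.append(x)
    return chain
```

Sampling log L(x) = −x²/2 yields a chain whose histogram approximates a standard normal; in the real analysis, x is replaced by the vector of cosmological parameters shown in Fig. 7, and the marginal histograms of the chain give the likelihood curves on the diagonal of the triangle plot.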
Further Study
We are using Monte Python to run Monte Carlo simulations on the new energy sectors we added. We are also planning a series of Monte Python runs that vary both the equation of state of the new matter source and how much of it exists in the universe. This will allow us to determine with more accuracy the likelihood parameters not only for positive curvature, but for the negative cosmological constant and other exotic matter sources as well. We also plan to take a more in-depth look at the relations between different variables to determine how they are connected to each other.
Figure 7. Monte Python likelihood triangle map over the parameters 100 ω_b, ω_cdm, n_s, τ_reio, Ω_k, α_s, z_reio, Ω_Λ, and H_0.
We have been using personal computers for the task, but we will be using the Dionysus supercomputer here at Manhattan College to dramatically increase our computation power. We also wish to investigate the possibility of non-Gaussianities in the CMB data. Detecting and identifying these non-Gaussianities would allow us to model inflation with greater accuracy. Unfortunately, the Monte Carlo simulator is built to work with Gaussian probability distributions, meaning we will have to search for different software in order to look for these.
Conclusion

Sources like the Planck satellite have ushered in an era of precision cosmology, allowing very strong constraints to be placed on the values of various cosmological parameters. Meanwhile, public resources like CLASS and Monte Python allow anyone with the know-how to conduct their own experiments without needing to pay for expensive software. The data published by Planck are the best possible measurement of the TT spectra; they cannot be improved further due to cosmic variance. However, we can still improve our knowledge of the
universe through CMB polarization and large-scale structure surveys. These will tell us more about the history of matter and energy in the universe.
Acknowledgments

The author wishes to thank the Jasper Summer Research Scholars Program for funding this research. He would also like to thank Dr. Bart Horn for guiding him through the confusing but captivating science of cosmology. Furthermore, he would like to thank Drs. Claire Zukowski and Raphael Flauger for very helpful discussions in understanding the CLASS software and getting it running, and the ESA and the Planck Collaboration for providing the data used in this project.
Comparison of Monte Carlo generators for Higgs decay processes

Sarah Reese∗
Department of Physics, Manhattan College

Abstract. Interpretation of experimental results from the ATLAS detector will be done through comparison with simulations made using Monte Carlo generators. Beyond the Standard Model (BSM) physical phenomena at current energy levels (13 TeV) are best interpreted in the context of Effective Field Theory (EFT) models. Many BSM EFT models of Higgs interactions have been made in anticipation of LHC Run-2 results. Before any of these models can be used to understand detector results, their generators must be validated. The following is a report on the validation of a particular generator for the Higgs Basis particle models. If validated, the generator will ultimately be used to create samples for analysis in future work.
Introduction

So that the reader can understand the nature of the physics at play and the motivation behind the project, a few concepts must be introduced. Qualitative or conceptual descriptions will be used as often as possible, instead of mathematical ones, for the sake of the assumed reader and as a reflection of my current level of physics knowledge. The Standard Model (SM) Lagrangian is the most complete description available for the interactions between the elementary particles which make up matter, leptons and quarks, and three of the four fundamental forces, strong, weak, and electromagnetic, with gravity not included. Experimental tests of the SM have been wildly successful, most famously with the detection of the Higgs boson at CERN [1, 2]. Now, in the search for an ever more fundamental and comprehensive description of the physical world, one must search for new physics. New descriptions of phenomena not described or explained in the SM generally fall under the catch-all of Beyond the Standard Model (BSM). For the purposes of current research, BSM physics is typically described mathematically through an Effective Field Theory (EFT). This is necessitated by the energy constraints of current particle colliders; the LHC tops out at a center-of-mass energy of 13 TeV. EFTs are low-energy approximations of underlying theories. They are important tools because they simplify phenomena when a complete theory is unavailable; the SM itself is an EFT. Just as one can accurately understand how a ball drops without quantum gravity, or how a muon decays with only the Fermi theory, one can also search for anomalous Higgs physics without a new fundamental theory [3]. EFTs for BSM physics expand the SM Lagrangian in operators of higher dimension, with each operator of dimension n suppressed by a factor of one over the energy scale Λ to the power n − 4. The current Higgs sector EFT takes the form of Eq. 1 [4],
∗ Research mentored by Rostislav Konoplich, Ph.D.
$$\mathcal{L}_{\text{eff}} = \mathcal{L}_{\text{SM}} + \sum_i \frac{c_i^{(5)}}{\Lambda}\,O_i^{(5)} + \sum_i \frac{c_i^{(6)}}{\Lambda^2}\,O_i^{(6)} + \sum_i \frac{c_i^{(7)}}{\Lambda^3}\,O_i^{(7)} + \sum_i \frac{c_i^{(8)}}{\Lambda^4}\,O_i^{(8)} + \ldots \qquad (1)$$

One aspect of BSM physics is the violation of the symmetry between charge conjugation and parity (CP). CP-symmetry can be explained best by example. If one took a particle anti-particle pair, swapped their charges, and inverted their space-time coordinates with those of their mirror-image, then the laws of physics would still apply to the pair in the same way. If this is found to not always be true, then the symmetry is considered to be "broken." This work centered on CP-violation in Higgs decays. Supersymmetry predicts a spin-0, CP-odd Higgs boson, A0, in addition to the CP-even spin-0 Higgs, H0, already known in the SM. CP-odd and CP-even follow the familiar nomenclature for functions: a function is even if f(x) = f(−x) and odd if f(−x) = −f(x). Diagrams 1 and 2 below show this visually.
CP-violation would tell us that matter and anti-matter behave differently. It is built into the SM with the flavor-mixing CKM matrix. Therefore, it could be our first clue in understanding the matter/anti-matter asymmetry of the universe: the question of why so much more matter than anti-matter is prevalent today when, at the early stages of the universe, they were present in equal amounts.
Generators

Monte Carlo generators are an important tool in particle physics. They provide a theoretical expectation for detector results. An accurate Monte Carlo simulation demonstrates an understanding of the detector, the theory, and the ultimate results. They are called Monte Carlo generators because they use the statistical techniques of their namesake in their algorithms to obtain numerical results. Generators used in high energy physics work roughly in the following way. A Lagrangian is written and matrix elements are created from the information within. A matrix element is a superposition of basis vectors describing the states of particles. The absolute square of a matrix element is necessary for calculating the cross-sections and widths of particle interactions. Calculations of cross-sections, the probability of collisions, and widths, the probability of decay, require integration over a multi-dimensional phase space. Monte Carlo generators accept an input for a given process, create the matrix element, and then perform the integration. The integration is done numerically a number of times using random values to create a statistical result which can be compared to the results from a detector.
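The random-sampling integration at the core of such generators can be illustrated with a one-dimensional toy. The sketch below is a generic Monte Carlo estimate of an integral, not an actual cross-section calculation; the integrand is a stand-in chosen only because its exact value is known:

```python
import math
import random

def mc_integrate(f, a, b, n, seed=0):
    """Estimate the integral of f over [a, b] by averaging f at n uniform random points."""
    random.seed(seed)
    total = sum(f(random.uniform(a, b)) for _ in range(n))
    return (b - a) * total / n

# Toy "squared matrix element": integrate sin^2(x) over [0, pi]; the exact answer is pi/2
estimate = mc_integrate(lambda x: math.sin(x) ** 2, 0.0, math.pi, 100_000)
```

The statistical error shrinks like one over the square root of the number of sampled points, which is why generators quote results from large event samples.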
The particle physics models discussed in this paper are the Higgs Basis and Higgs Characterization [5, 6]. EFT Lagrangians are written using an operator basis in a mass eigenstate, with different Wilson coefficients for the CP-even and CP-odd Higgs interactions. A mass eigenstate simply means that the observable operators are parameterized such that they have a direct relation to measurable physical quantities. The degrees of freedom correspond to the mass dimensions of each observable operator. In this work we consider two Monte Carlo generators. The first is a standalone decay simulator, Hto4l [6]. The other is a module in MadGraph5 for the Higgs Characterization (HC) [5].

$$\begin{aligned}
\mathcal{L}_{hvv} = \frac{h}{v}\Big[ &(1+\delta c_w)\frac{g^2 v^2}{2}\, W^+_\mu W^{-\mu} + (1+\delta c_z)\frac{(g^2+g'^2)v^2}{4}\, Z_\mu Z^\mu \\
&+ c_{ww}\frac{g^2}{2}\, W^+_{\mu\nu} W^{-\mu\nu} + \tilde{c}_{ww}\frac{g^2}{2}\, W^+_{\mu\nu} \widetilde{W}^{-\mu\nu} + c_{w\Box}\, g^2 \big(W^-_\mu \partial_\nu W^{+\mu\nu} + \text{h.c.}\big) \\
&+ c_{gg}\frac{g_s^2}{4}\, G^a_{\mu\nu} G^{a\mu\nu} + c_{\gamma\gamma}\frac{e^2}{4}\, A_{\mu\nu} A^{\mu\nu} + c_{z\gamma}\frac{e\sqrt{g^2+g'^2}}{2}\, Z_{\mu\nu} A^{\mu\nu} + c_{zz}\frac{g^2+g'^2}{4}\, Z_{\mu\nu} Z^{\mu\nu} \\
&+ c_{z\Box}\, g^2 Z_\mu \partial_\nu Z^{\mu\nu} + c_{\gamma\Box}\, g g'\, Z_\mu \partial_\nu A^{\mu\nu} \\
&+ \tilde{c}_{gg}\frac{g_s^2}{4}\, G^a_{\mu\nu} \widetilde{G}^{a\mu\nu} + \tilde{c}_{\gamma\gamma}\frac{e^2}{4}\, A_{\mu\nu} \widetilde{A}^{\mu\nu} + \tilde{c}_{z\gamma}\frac{e\sqrt{g^2+g'^2}}{2}\, Z_{\mu\nu} \widetilde{A}^{\mu\nu} + \tilde{c}_{zz}\frac{g^2+g'^2}{4}\, Z_{\mu\nu} \widetilde{Z}^{\mu\nu} \Big] \qquad (2)
\end{aligned}$$
Hto4l uses the Higgs Basis Lagrangian shown above in Eq. (2). The Higgs Characterization (HC) Lagrangian is shown below in Eq. (3).

$$\begin{aligned}
\mathcal{L}_0^V = X_0 \Big\{ & c_\alpha \kappa_{SM} \big[ \tfrac{1}{2}\, g_{HZZ}\, Z_\mu Z^\mu + g_{HWW}\, W^+_\mu W^{-\mu} \big] \\
&- \tfrac{1}{4}\big[ c_\alpha \kappa_{H\gamma\gamma}\, g_{H\gamma\gamma}\, A_{\mu\nu} A^{\mu\nu} + s_\alpha \kappa_{A\gamma\gamma}\, g_{A\gamma\gamma}\, A_{\mu\nu} \widetilde{A}^{\mu\nu} \big] \\
&- \tfrac{1}{2}\big[ c_\alpha \kappa_{HZ\gamma}\, g_{HZ\gamma}\, Z_{\mu\nu} A^{\mu\nu} + s_\alpha \kappa_{AZ\gamma}\, g_{AZ\gamma}\, Z_{\mu\nu} \widetilde{A}^{\mu\nu} \big] \\
&- \tfrac{1}{4}\big[ c_\alpha \kappa_{Hgg}\, g_{Hgg}\, G^a_{\mu\nu} G^{a,\mu\nu} + s_\alpha \kappa_{Agg}\, g_{Agg}\, G^a_{\mu\nu} \widetilde{G}^{a,\mu\nu} \big] \\
&- \tfrac{1}{4}\frac{1}{\Lambda}\big[ c_\alpha \kappa_{HZZ}\, Z_{\mu\nu} Z^{\mu\nu} + s_\alpha \kappa_{AZZ}\, Z_{\mu\nu} \widetilde{Z}^{\mu\nu} \big] \\
&- \tfrac{1}{2}\frac{1}{\Lambda}\big[ c_\alpha \kappa_{HWW}\, W^+_{\mu\nu} W^{-\mu\nu} + s_\alpha \kappa_{AWW}\, W^+_{\mu\nu} \widetilde{W}^{-\mu\nu} \big] \\
&- \frac{1}{\Lambda}\, c_\alpha \big[ \kappa_{H\partial\gamma}\, Z_\nu \partial_\mu A^{\mu\nu} + \kappa_{H\partial Z}\, Z_\nu \partial_\mu Z^{\mu\nu} + \big( \kappa_{H\partial W}\, W^+_\nu \partial_\mu W^{-\mu\nu} + \text{h.c.} \big) \big] \Big\} \qquad (3)
\end{aligned}$$

The HB Lagrangian has the same field content and the same linearly realized SU(3)C × SU(2)L × U(1)Y local symmetry as the SM. The higher-dimensional operators are organized in a systematic expansion in the operator dimension D, where each consecutive term is suppressed by a higher power of Λ [7]. In HC, by contrast, an effective Lagrangian is constructed below the electroweak symmetry breaking scale, where SU(2)L × U(1)Y is reduced to U(1)EM. It is clear that the two are quite dissimilar. A paper by Falkowski et al. [4] includes a set of translation formulas (Table 1) between the Wilson coefficients of HB and HC.

Table 1. Parameters included in HB

Early work on this project consisted of correcting those formulas and checking for typos. After translation, the HB and HC Lagrangians are made identical. It is important to note from Table 1 that HB includes some dependent parameters. When translating, one should be careful to note which parameters might also be involved. For example, work with the W+ or W− bosons requires one to use the cww or c̃ww coefficients. Those are dependent on czz, cza, and caa. In order to include the desired interactions, one must also alter those parameters.
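As an illustration of such a dependency, the dependent coefficient cww can be expressed in terms of czz, cza, and caa through the weak mixing angle. The relation in the sketch below is quoted from memory of the Higgs basis proposal and should be verified against Ref. [4] before use; the numerical value of sin²θw used as the default is likewise an assumption:

```python
def cww_from_dependents(czz, cza, caa, sw2=0.2312):
    """Dependent HB coefficient c_ww from c_zz, c_za, c_aa.

    sw2 is sin^2 of the weak mixing angle. The relation used here,
        c_ww = c_zz + 2 * sw2 * c_za + sw2**2 * c_aa,
    is our recollection of the Higgs basis proposal and should be
    checked against the published translation tables in Ref. [4].
    """
    return czz + 2.0 * sw2 * cza + sw2 ** 2 * caa

# With only c_zz switched on, c_ww simply inherits its value
```

The practical point stands regardless of the exact coefficients: turning on czz alone silently turns on cww as well, so the translated HC parameter set must account for it.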
Validation

We chose to focus on Higgs decays into two muon anti-muon pairs. A same-flavor decay has the possibility of lepton interference, when the generator mixes up the muons, and we wanted to stress the generator. Muons are also readily detected experimentally (the outer layers of both ATLAS and CMS are huge muon spectrometers), so my test samples could be useful in the future. When a Higgs decays into two vector bosons, which eventually decay into lepton pairs, the angle of separation is measurable. The angle between the decay planes, denoted by φ, is especially sensitive to CP-violating anomalous Higgs couplings. Because of this, we used φ distributions for comparison. For the sake of time efficiency, we only considered Higgs production from gluon-gluon fusion and Higgs decays into two Z bosons. Four parameter sets were chosen, corresponding to an SM, a CP-even, a CP-odd, and a CP-mixed case. Parameters were chosen in the HB and then translated into HC. An HB generator was considered valid if, after translation, its distributions matched those of our HC generator for any arbitrary parameter setting. The HC generator was used as the benchmark because it had been previously validated in other work.

Hto4l

The comparison process was done in a few steps. First, I would generate a sample of a given parameter set in both Hto4l and MG5. Second, I would apply an analysis code, written in C, to reconstruct the Higgs, the two Z bosons, and any other kinematic variables for histograms. Third and finally, I would compare the histograms, usually for the φ distribution, of both simulations to check for agreement. Simply put, we considered a generator valid if its distributions matched one-to-one with distributions made with HC in MG5. We checked for agreement first by comparing distributions at the SM level, meaning only the SM portion of the EFT Lagrangian contributes to our process.
A same-flavor case was chosen because the φ distributions would be asymmetric and include lepton interference, as previously stated. For comparison, we chose more or less arbitrary parameters in each of the CP cases. Generations were made with one million events each. Table 2 shows the settings for each simulation. Figs. 1 through 4 show the agreement or disagreement between distributions.
Table 2. Settings for various simulations

Standard Model:  HC: κSM = 1.414214, cosα = 0.707107.  HB: δcz = 0.
SM + CP-even:    HC: κSM = 1.414214, cosα = 0.707107, κHZZ = -6.2978, κHda = 4.7225.  HB: δcz = 0, czz = 10.
SM + CP-odd:     HC: κSM = 1.414214, cosα = 0.707107, κAZZ = -31.4888.  HB: δcz = 0, c̃zz = 10 (-10).
SM + CP-mixed:   HC: κSM = 1.414214, cosα = 0.707107, κHZZ = -6.2978, κHda = 4.7225, κAZZ = -31.4888.  HB: δcz = 0, czz = 2, c̃zz = 10 (-10).
Figure 1. φ, Standard Model (SM)
Figure 2. φ, SM + CP-odd
Figure 3. φ, SM + CP-even
Figure 4. φ, SM + mixed CP
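The reconstruction step described above, building Z candidates and the Higgs from the muon four-momenta, reduces to four-vector addition and invariant masses. The original analysis code was written in C; the following is a hypothetical Python equivalent with an invented toy event (GeV units):

```python
import math

def add4(*vecs):
    """Sum four-momenta given as (E, px, py, pz) tuples."""
    return tuple(sum(c) for c in zip(*vecs))

def inv_mass(p):
    """Invariant mass sqrt(E^2 - |p|^2), clamped at zero against rounding."""
    e, px, py, pz = p
    return math.sqrt(max(e**2 - px**2 - py**2 - pz**2, 0.0))

# Toy event: a back-to-back muon pair forming a Z-like candidate
mu1 = (45.0, 0.0, 0.0, 44.9999)
mu2 = (45.0, 0.0, 0.0, -44.9999)
z_candidate = add4(mu1, mu2)
m_z = inv_mass(z_candidate)
```

Summing the other lepton pair the same way, and then all four momenta, yields the second Z candidate and the Higgs candidate whose kinematics fill the histograms.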
Efficiency was an important factor in our comparison, though we did not rule out any generators because of long computing time. Hto4l is very fast for what it does, since it only calculates Higgs decays. At one million events, computation took about half to three-quarters of an hour in the ZZ∗ channel. More complicated processes, such as Zγ or γγ, take much longer because they involve photons, which are massless, introducing infinities into the matrix element calculation. Another interesting feature of Hto4l is that it can generate either the full quadratic or only the linear component of the matrix element. As seen in Eq. 1, expansion terms are suppressed by factors of 1/Λ. The 1/Λ and 1/Λ³ terms from the Lagrangian do not contribute appreciably, so they drop out, leaving us with matrix elements in the form of Eqs. 4 and 5. The 1/Λ⁴, 1/Λ⁶, and 1/Λ⁸ terms include operators of dimension higher than six, and these contributions are unknown. It is sometimes convenient, or even crucial, to not include those terms. A linear matrix element includes only terms up to 1/Λ² in the complex square of the matrix element.
$$M = g_{sm} M_{sm} + \frac{1}{\Lambda^2}\, g_6 M_6 + \frac{1}{\Lambda^4}\, g_8 M_8\,, \qquad M^\dagger = g_{sm} M_{sm}^\dagger + \frac{1}{\Lambda^2}\, g_6 M_6^\dagger + \frac{1}{\Lambda^4}\, g_8 M_8^\dagger \qquad (4)$$

$$\begin{aligned}
|M|^2 = M M^\dagger = \; & g_{sm}^2 |M_{sm}|^2 + \frac{1}{\Lambda^2}\, g_{sm} g_6 \big( M_{sm} M_6^\dagger + M_{sm}^\dagger M_6 \big) \\
&+ \frac{1}{\Lambda^4} \big[ g_6^2 |M_6|^2 + g_{sm} g_8 \big( M_{sm} M_8^\dagger + M_{sm}^\dagger M_8 \big) \big] \\
&+ \frac{1}{\Lambda^6}\, g_6 g_8 \big( M_6 M_8^\dagger + M_6^\dagger M_8 \big) + \frac{1}{\Lambda^8}\, g_8^2 |M_8|^2 \,. \qquad (5)
\end{aligned}$$
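The power counting in Eqs. 4 and 5 can be spot-checked numerically. The sketch below uses hypothetical real numbers for the couplings, amplitudes, and scale (the physical amplitudes are complex, so terms like M_sm M_6† + M_sm† M_6 collapse to 2 M_sm M_6 here), and verifies that the full square equals the sum of the terms collected by powers of 1/Λ:

```python
# Hypothetical real placeholder values for couplings, amplitudes, and the scale
gsm, g6, g8 = 1.0, 0.7, 0.3
Msm, M6, M8 = 2.0, 1.5, 0.5
Lam = 10.0

# Full squared amplitude, |M|^2 = M^2 for real M (cf. Eq. 4)
full = (gsm * Msm + g6 * M6 / Lam**2 + g8 * M8 / Lam**4) ** 2

# The same quantity collected term by term in powers of 1/Lambda, as in Eq. 5
collected = (gsm**2 * Msm**2
             + 2 * gsm * g6 * Msm * M6 / Lam**2
             + (g6**2 * M6**2 + 2 * gsm * g8 * Msm * M8) / Lam**4
             + 2 * g6 * g8 * M6 * M8 / Lam**6
             + g8**2 * M8**2 / Lam**8)

# A "linear" matrix element keeps only the terms up to 1/Lambda^2
linear = gsm**2 * Msm**2 + 2 * gsm * g6 * Msm * M6 / Lam**2
```

The difference between `full` and `linear` is exactly the quadratic piece that the linear generation mode omits.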
Including only the linear terms drastically changes the distributions, so it was important for us to find a way to generate linear HB samples in HC; we had to find a way of subtracting the quadratic terms from the generation. Newer releases of MG5, not used in this study, include an option to exclude quadratic terms. Since that option was not available to us, we did the subtraction ourselves. It was relatively simple when done at the histogram level. To get only the linear terms in the distributions for HC, one first has to create two samples: one for the full simulation, SM plus EFT, and one for just the BSM terms. Then one subtracts the BSM sample, weighted by the ratio of the BSM cross-section over that of the complete SM-plus-EFT sample. Figs. 5 and 6 show quadratic and linear forms of simulations in both Hto4l and MG5. Hto4l is in fairly good agreement with MG5, but there were some small issues. The most obvious one should be clear after examining the plots in Fig. 2. Looking closely, one will notice that the CP-odd parameters, c̃zz and κAzz, in Hto4l and MG5 respectively, have the same sign, whereas after translation they should have opposite signs. Once compared, the CP-odd HB and HC simulations were not in agreement: for them to agree, the Hto4l distributions needed to be flipped over the central vertical axis. So we swapped the sign of the coefficient, after which the simulations were in perfect agreement. Fig. 7 below shows how the opposite-sign distributions appear.
Figure 5. φ, SM + CP-odd including quadratic terms
Figure 6. φ, SM + CP-odd including only linear terms
Figure 7. φ, Hto4l sign error in CP-odd terms
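The histogram-level subtraction described above can be sketched as follows. The per-bin yields and cross-sections are hypothetical, and the exact normalization convention should be taken from the values reported by the generator:

```python
def linear_histogram(h_full, h_bsm, sigma_full, sigma_bsm):
    """Remove the purely quadratic BSM piece from a full SM+EFT histogram.

    h_full and h_bsm are per-bin event counts from the two generations;
    sigma_full and sigma_bsm are their cross-sections. Each histogram is
    normalized to its cross-section, then the BSM-only (quadratic) piece
    is subtracted bin by bin, leaving the SM-plus-linear distribution.
    """
    n_full, n_bsm = sum(h_full), sum(h_bsm)
    return [sigma_full * f / n_full - sigma_bsm * b / n_bsm
            for f, b in zip(h_full, h_bsm)]

# Hypothetical 4-bin phi distributions and cross-sections
h_full = [300, 200, 200, 300]
h_bsm = [100, 150, 150, 100]
h_lin = linear_histogram(h_full, h_bsm, sigma_full=2.0, sigma_bsm=0.5)
```

By construction the subtracted histogram integrates to the difference of the two cross-sections, which is a useful sanity check on the normalization.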
We contacted both the Hto4l and MG5 teams about the issue, in hopes that they could fix the error. Neither could find any issue in their program. Other preliminary work with another HB generator did not show the sign error when compared with MG5, which indicates to us that the sign problem originates with Hto4l. The other issues with Hto4l are not so substantial. When simulating a Higgs decay in the Zγ channel, one must make cuts on the lepton pair masses. Cuts mean that we set a condition such that the invariant mass of each lepton pair must be greater than some minimum. The cuts are important for efficiency because they let us avoid integrating over the photon infinities. Each boson, Z or photon, decays into two leptons, muons in our case, and reconstruction of the pair gives the invariant mass of their mother particle. MC generators, at least the ones I use, write particle data to something called a Les Houches Event (LHE) file. It is just a way of formatting the information so that it can be easily read, understood, and used for analysis. In the LHE file, the particles are identified and listed with their masses, four-momentum, and mother. The lingo is that an ordered pair of leptons is one in which the particles with the same mother are listed one after another; a disordered pair is one in which they are not. The labels themselves are arbitrary.
As stated before, the mother particle's invariant mass comes from the combination of its daughter particles. Since we made cuts on the lepton pairs, that combination is very important. Ideally, the generator should distinguish between ordered and disordered pairs during combination. Unfortunately, in Hto4l this does not seem to happen: Hto4l combines every pair as if it were an ordered pair, which makes the cuts irrelevant and ruins the distributions. The figures below show an example of an ordered and a disordered combination. In Fig. 8, the pairs are ordered and the combination is correct; the listed mother particle mass matches the mass calculated from the daughters. In Fig. 9, the pairs are disordered and combined incorrectly; the masses of their mother particles are wrong and therefore not cut when they should have been. Another issue visible in the figures is that, even though we are working this time in the Zγ channel, the mother particles are both written as Z bosons. You can tell because the PID numbers, in the first column, are both 23. These errors are not really a big deal, because they can be fixed in the analysis code, but they are nonetheless worth mentioning.
Figure 8
Figure 9
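The pairing problem above can be guarded against in analysis code by recomputing each mother's invariant mass from its listed daughters and comparing it with the mother mass written in the LHE record. A minimal sketch, with hypothetical four-momentum tuples standing in for real LHE parsing:

```python
import math

def inv_mass2(p1, p2):
    """Invariant mass of a daughter pair given as (E, px, py, pz) tuples."""
    e = p1[0] + p2[0]
    px, py, pz = (p1[i] + p2[i] for i in (1, 2, 3))
    return math.sqrt(max(e * e - px * px - py * py - pz * pz, 0.0))

def pairing_consistent(mother_mass, daughter_a, daughter_b, tol=0.5):
    """True if the listed mother mass matches the mass rebuilt from the pair."""
    return abs(inv_mass2(daughter_a, daughter_b) - mother_mass) < tol

# Hypothetical Z -> mu+ mu- record (GeV), listed mother mass 91.2
mu_plus = (45.6, 0.0, 0.0, 45.6)
mu_minus = (45.6, 0.0, 0.0, -45.6)
ok = pairing_consistent(91.2, mu_plus, mu_minus)
```

A disordered pairing fails this check, flagging events whose cuts were applied to the wrong invariant mass.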
Conclusion

The motivation behind this work was to find a suitable generator in the Higgs Basis for future studies searching for CP-violation in Higgs decays. The MC generator Hto4l will simulate such processes in the Higgs Basis. We validated Hto4l against another generator, MadGraph5 with the Higgs Characterization model. In order to compare the two, the HB Lagrangian had to be translated to the HC Lagrangian using the formulas developed by Falkowski et al. [7]. Two inconsequential errors were found in the way Hto4l writes events to an LHE file. The first of those was an error
in the mother particle identification in the LHE files. The second was the incorrect combination of the masses of lepton pairs. The major issue is the contradiction between the signs of the CP-odd parameters in Hto4l and MG5: in the MG5 samples, the same parameters in HC produce a distribution of the opposite asymmetry compared to that of HB. We concluded that Hto4l is a suitable generator for SM samples and mostly satisfactory for BSM work.
Acknowledgments

The author would like to thank Dr. Konoplich for all his patience and mentorship, as well as for giving her the opportunity to do this work. She also thanks Kirill Prokofiev, Ki Lie, Tak Shun Lau, Jaiwei Wang, and Nikita Belyaev for their collaboration, and Alexandre Sahkarov, Allen Mincer, and Peter Nemethy for letting her share their office. Finally, she thanks the School of Science at Manhattan College for creating a place for undergraduate research. This work was supported by the National Science Foundation under Grant No. PHY-1402964.
References
[1] ATLAS Collaboration, G. Aad et al., “Observation of a new particle in the search for the Standard Model Higgs boson with the ATLAS detector at the LHC,” Phys. Lett. B716 (2012) 1–29, [arXiv:1207.7214].
[2] CMS Collaboration, S. Chatrchyan et al., “Observation of a new boson at a mass of 125 GeV with the CMS experiment at the LHC,” Phys. Lett. B716 (2012) 30–61, [arXiv:1207.7235].
[3] J. Brehmer, “Higgs Effective Field Theory,” Particle Physics beyond the Standard Model, 11 Jan. 2016, Heidelberg RTG.
[4] A. Falkowski, “Higgs Basis: Proposal for an EFT Basis Choice for LHC HXSWG,” LHC Higgs Cross Section Working Group Internal Note, 21 Nov. 2017, pp. 12–33.
[5] P. Artoisenet, P. de Aquino, F. Demartin, R. Frederix, S. Frixione, F. Maltoni, M. K. Mandal, P. Mathews, K. Mawatari, V. Ravindran, S. Seth, P. Torrielli, and M. Zaro, “A Framework for Higgs Characterisation,” Journal of High Energy Physics, vol. 043, no. 11, 2013, doi:10.1007/jhep11(2013)043.
[6] S. Boselli, C. M. Carloni Calame, G. Montagna, O. Nicrosini, F. Piccinini, and A. Shivaji, “Higgs Decay into Four Charged Leptons in Presence of Dimension Six Operators,” Journal of High Energy Physics, doi:10.1007/JHEP01(2018)096, [arXiv:1703.06667].
[7] A. Falkowski, B. Fuks, K. Mawatari, K. Mimasu, F. Riva, and V. Sanz, “Rosetta: an operator basis translator for Standard Model effective field theory,” Eur. Phys. J. C75 (2015), no. 12, 583, [arXiv:1508.05895].
Writing science research manuscripts for publication

Lilliana McHale∗
Department of English, Manhattan College

Abstract. Composing manuscripts for scientific publications is complicated and specific to a journal’s particular area of study. The process involves understanding all aspects of the research, so that the manuscript conveys the research accurately to other scientists. Many young writers struggle with the construction of tables and figures, yet these presentations are important for conveying technical information clearly. Construction of a manuscript for publication involves: (1) compiling all data, (2) creating data summaries with corresponding tables and figures, (3) writing initial drafts, (4) editing to bring materials into focus, and (5) proofreading to produce a manuscript ready for publication.
Introduction

Composing manuscripts for scientific publications is complicated; however, doing so is necessary so researchers can convey their scientific results in a manner that other scientists understand. Protocols are well known among scientists, and each journal has specific requirements. These protocols must be learned by young scientists if they want to be successful authors who can present their findings to the scientific community. Young scientists who do not follow the exact protocols of journals find themselves at a disadvantage. This guide conveys these protocols to young researchers to help with the process of publication. It provides guidance on the necessary preparation for writing, as well as instruction on composing the various sections of a manuscript and well-tested dos and don’ts of the cumbersome process. An atypical method for citations is used herein: references are given by the author’s last name followed by a number, which refers to the page number being cited. This method is used with the anticipation that the reader will acquire one of the references listed.
Before Writing

Define and narrow the subject area
Before starting to write your manuscript, you must define and bracket your subject area. In other words, you need to set limits as to what you will review and cover out of the vast possible subject areas. You must also consider what background information you need to highlight to set the stage for your research results. You have completed your research, to some degree, and want to present it in a meaningful way, so your results will be accepted by scientists in the field. Ask yourself these questions to prompt writing: What do scientists know about this topic? What big-picture questions have we not answered yet? Why is this important science to study right now? (Turbeck 419) Once you have figured out what you want to include within your manuscript, state what will not be covered, to set limits for your manuscript as well.
∗ Research mentored by Lance Evans, Ph.D.
Voice and Audience
Science writing is not about explaining your process; rather, it is about what you want your audience to learn after reading your research. What you learn is important, but being able to disseminate information is critical for your research to be successful. You must decide what you want your audience to learn and what story, or thread, of research you will present to them. Before beginning the writing process, determine your audience as best you can. What level of understanding does your audience have of the subject? What level of science do you need to start them out with? By catering to your selected audience, you can appeal to readers more easily. Start by asking yourself these questions: “Who is my audience? What do readers hope to take away from my writing? What do I want them to get from my writing?” (Turbeck 419)

Selecting Scientific Papers, Creating Your Outline
Once you have defined your topic’s bounds, you must compile scientific papers related to your research and categorize them based on how they fit into your research outline. Scientific papers function as a way to direct your manuscript toward acceptability within the scientific community. Review the literature thoroughly enough that you can make an outline and discover whether the subject you have defined is too broad or too narrow. Each paper should be reviewed broadly enough that it can be outlined effectively, to find where your document will focus. If the materials in the scientific papers do not fit, you might need to alter your outline to add materials from those papers into your document. Ask yourself: did you set appropriate bounds? As you discover new sources that fit within your general research area, you may need to reset your bounds.
Information within scientific papers will be part of the subject, but the papers do not make any definite decisions for your new manuscript; this is your research, and other research simply acts as factual backing for it.

Peer-Reviewed Sources
The research aspect of the literature is key to writing a strong manuscript. Finding sources is not enough to ensure the sources are scholarly, or even acceptable to cite within your manuscript. Formal scientific journals, and most books, have gone through the peer-review process that is demanded for scholarly research. Many articles online have not been checked for accuracy, and thus the science within these studies could be unaccepted or discredited. Publications that have not been peer-reviewed can completely discredit your manuscript and argument, as there could be false science backing your manuscript (Pechenik 25). You should search for, and use, only high-quality sources related to your topic, so you do not waste your time reading or gathering papers from non-peer-reviewed sources (Pechenik 31). Young scientists must locate many peer-reviewed papers on their topic for several additional reasons. These papers will serve as models to supplement the creation of a general outline for your document. Use these sources as references throughout your writing to help standardize your own manuscript. Scientific papers will also offer assistance with word choices for your own
manuscript subject. Word choice and topic introduction are important for appealing to your audience, as it is necessary to talk to them differently than you would discuss your research with your co-author. Your sources might also suggest where to submit your manuscript, since they appear in journals that have published research on topics similar to yours. Additionally, the scientific papers whose topics are most like your own will likely provide a stack of references that can supplement your own document through their quality sources. Many scholarly sources are available through Google Scholar, accessible at https://scholar.google.com, by searching for relevant articles or books. Manhattan College’s online library search page also offers a Discovery Search that allows students to search within all the sources paid for by Manhattan College. Within Discovery Search, students can search by keyword, author, and title. An additional online source, with an expansive amount of literature, is the Internet Public Library, accessible at http://www.ipl.org.

Effectively Utilizing Sources
Putting science in context is important, especially for posters and papers, as both make science relevant and understandable for those who do not know the topic. Additionally, every statement of fact or opinion must be supported with in-text citations; any source not cited in the text does not belong in your References. The reverse is true as well: any source that is cited in the manuscript absolutely needs to be placed within your References. It is essential, when reading literature, to take notes in your own words. If you are copying your source’s words, you do not understand the source; otherwise, you would be able to put what that author is saying into your own words.
Working in your own words, you will understand the material better and will thus relay ideas more effectively. You want to commit to overcoming and understanding your struggles, as doing so will help you anticipate similar struggles for readers (Pechenik 4). Your own struggles will pinpoint where your writing is weak. Do not plagiarize, or you will suffer serious consequences. Whenever you use an idea, thought, or fact proved by someone else, you must cite the source. Citing, especially of peer-reviewed sources, is important in demonstrating your mastery of the specific topic you are writing about, as you have spent a great deal of time with it (Pechenik 5). Do not quote your sources; scientific literature rarely uses direct quotes. Instead, describe what you wish to quote in your own words (Pechenik 4). Doing so provides a platform to showcase what you have learned throughout your research, and paraphrasing allows you to add your own research to the topic without misinterpreting or manipulating the quote. While researching, it is important to write down anything and everything that stands out to you. Even if you do not see how your notes will fit into your manuscript, taking them will allow you to remember everything you have done with little searching, reminiscing, or plagiarizing. Your notes will force you to get comfortable with writing your science down and prove that writing about science is not scary after all! Ideas to record:
The Manhattan Scientist, Series B, Volume 5 (2018)
McHale

• The design and goals of the research.
• Initial thoughts and questions on the research.
• Observations alongside all numerical data and calculations.
• Reactions to your data and overall experiment as research progresses (Pechenik 35).
Organizing Thoughts and Materials

You will need ample time with no distractions to delve into writing, so think of when you will have a large chunk of time. Before writing, organize your lab materials so you know what you want to write, and refresh the subject by looking over the notes you took throughout the research process. This will eliminate unnecessary "thinking" time before you start writing and will keep everything well organized. Just because you have now entered the writing stage, however, does not mean that you are done with notetaking. In fact, it is just the opposite: continue to jot down ideas and thoughts throughout the entire writing process, as these tidbits may supplement your overall writing.

Review and Compile

To know what you want to write, you must be comfortable with all of your data. It is necessary to examine all of your data critically. Scrutinizing your data will force you to unpack it and understand all of its possible interpretations (Pechenik 35). It is smart to look over your data with someone else, possibly your co-author or another scientist who understands your topic area, to see if they reach similar interpretations of your data. To confirm you are close to writing, ensure you understand the hypothesis of your experiment. You must know how you gathered data and be able to write information in a way that displays that data logically and meaningfully. You must organize data in a clear and logical way to address each hypothesis clearly and effectively. To avoid confusion, create a flow chart or table for your methods section. Keep flow charts brief; they will only benefit you and will not be in your final product (unless they provide an overview that helps your reader far beyond the way your writing does). Whatever organizational method you use should remain descriptive enough that you can remember the materials used without an additional reference guide.
For each hypothesis, make tables or figures to organize and present data. Data must be presented clearly and completely. At your desk, write each hypothesis on a piece of paper. Next, assign pages of data, tables, and figures to their respective hypothesis. Before writing, answer all questions about procedures, materials, and results. Do not start writing until this is done as you do not want to confuse yourself, or present false information to the reader (Evans 2).
Writing

The major headings for publications in most scientific journals are: Abstract, Introduction, Materials and Methods, Results, Discussion, References, Tables and Figures. Each heading, unless
you are specifically told otherwise, will have its own line and be capitalized. These major headings may be subtle, but they are nevertheless present in this order in all scientific journals. However, each scientific journal has a specific style. Below is a general description of what is required for each of these sections.

Introduction

The purpose of the introduction is to introduce a topic and lay out expectations for what will come. The introduction section alone includes information that prepares readers for finishing arguments (Pechenik 199). Think of the introduction as the beginning of a funnel; put your research in a large-scale context for the average layman to understand and then narrow in to focus on your topic and hypothesis (Turbeck 419). By initially utilizing a wide view, there is an extensive range of possible implications for your research. You can then tailor the background information your reader needs to understand your manuscript by providing ideas that are both understandable and relevant to the topics you will cover (Turbeck 419). Just because you are only covering a small section of a much larger scientific topic does not mean that you will focus your research only on that topic. Additional background information is important, as it can allow you to present your manuscript in a variety of ways. Presenting research in certain contexts allows for different opportunities for journal publication, as manuscripts will often be altered to fit criteria desired by specific journals. Once you have presented the needed background information, focus on the topic at hand by stating the research question(s) your manuscript discusses. This problem can also be called "the knowledge gap" because it poses a question whose answer is unknown (Turbeck 420). The knowledge gap presents a problem or asks a question that fits into a larger field (Turbeck 420).
This is where you will convey how your document will answer some of the questions left open by this gap in knowledge. This is one of the most important parts of your manuscript, as both your discussion and conclusion focus on resolving some of the knowledge gap you are highlighting. The knowledge gap is behind your hypotheses and is the driving force behind how you pursue your experiment (Turbeck 420). A hypothesis is the answer to this question; the hypothesis tries to address this knowledge gap. After presenting the knowledge gap, you will briefly define your hypothesis, methods, and design. A hypothesis needs to be testable and convincing, with support from your research, and presented at the beginning of a scientific paper so the rest is spent supporting it. Give enough information and do not be elusive, but do not give so much information that this section makes the Materials and Methods section redundant. Here is where any needed justification for your research can come in. The introduction is generally 3-4 paragraphs. The first paragraph will introduce your topic. This paragraph should be written in layman's terms and be clear to any educated reader. Here is when you provide all necessary basic information that your reader must have in order to follow your manuscript. Remember to write for your reader!
The second paragraph will start to relate the topic of your manuscript to modern science. For example, include why this is an important topic to pursue in this time period. Highlight why this is critical for us now and what makes this specific research special, interesting, or different. Do not forget the point of good science is to receive funding and get published; this is a key section for grant funding! The third paragraph describes your unique approach to your subject. Here is where any questioning will be done. State and present your assumptions and then your individual hypotheses one by one (Evans 2). If an idea seems to run on, add said idea to your discussion, not your introduction. Write an introduction for the study you completed, not the study you initially thought about. It is okay to change your approach due to what your research presents. When presenting each hypothesis:
• The hypothesis will lead to, but not into, the manuscript's body.
• Introduce each hypothesis and give a brief umbrella of information, if necessary.
• Create a theoretical and logical flow chart from topic to theses.
• For hypothesis 1, give the hypothesis a proper name and give its due alongside details.
• At the end of hypothesis 1, transition to the next hypothesis.
• Go to hypothesis 2 and follow the same rules that applied to hypothesis 1.
• Once done with the hypotheses, write a conclusion to bring back ideas from the preexisting flow chart.
• Wrap up hypotheses (Evans 2).
Materials and Methods

Using your flow charts, or whatever organizational method you chose, reflect on your process and write out your procedures. This explanation needs to be detailed clearly enough that readers could repeat the experiment on their own. Additionally, materials and methods cannot be written like a recipe; recipe-style writing is popular with students who do not have enough time to put their actions into words, and it is unprofessional. Students frequently struggle when attempting to explain their materials thoroughly enough for the reader to repeat the experiment. Authors forget that what they write, not what they think, is the information relayed to the reader. If materials must be explained in detail, take your time and put your words down logically. If you cannot put them down logically right away, put any ideas down on paper and come back to them once you are refreshed. Remember this is a topic you have been studying in depth, while readers may know little about it. Strong justifications for actions, alongside clearly written methods, are key for clarity and ease for readers. Understanding the implications of each scientific method is necessary, as you are trying to prove to readers that you used scientifically valid methods to come to your logical conclusion. Organize materials and methods using multiple subsections that are labeled with appropriate headers
for each procedure that the materials went through. These subsections can be reused in the results and discussion sections, so the reader can easily follow the threads between each hypothesis and its respective sections. Providing subdivisions within the methods section allows readers to understand the direct implications of each step through cause and effect. You are also providing steps so your research can be recreated. Without being able to recreate research, results are discredited and virtually worthless. It is essential to list model numbers and specific settings of all equipment used while conducting research. In addition to simple experimental procedures, statistical analysis steps will be described here as well. This includes any computer analysis, or research of that nature. Ideally, all steps followed will be listed here. If the procedure came from a scientific paper, or an outside source, the source must be cited. There will also be a general description of said method, along with any alterations to it. You will not regurgitate all steps from the cited source, but instead make the words your own. Enough information will be provided that readers can replicate your specific experiment without said source. Do not qualify anything, including errors, elsewhere; any qualifications go in the materials and methods section. Highlight as much as possible for readers in regard to the experiment. Clarify by providing information such as: organism used, site depiction, alterations in the experiment, measurements, and analysis. This section is typically written in past tense but can use active and passive voice in order to avoid repetition, when needed.

Results Section

To write the results section, you must identify key sections of your research that are essential for crafting your research arc, the progression of your research to your conclusion.
By doing so, you allow yourself to create descriptive sentences that summarize the results of each graph, table, or image (Turbeck 422). Results will be written in past tense (Pechenik 182).

Characteristics of figures

Before writing, craft summaries of each data set. This forces you to focus each data set around one central idea and will avoid confusing readers. Each figure should be able to be summarized in a few concise sentences. Doing so will make your life easier, as you will not need to refresh yourself with the data and can use past thoughts to help write figure captions or table summaries (Pechenik 38). Describe data through a biological lens and use statistics to back your statements up. Statistics, with explanations, lend credibility to research (Turbeck 422). Your job, as a scientist, is to figure out a way to convert data into scientific terms and to assess whether your data conclusively support your conclusions. Results sections will address:
• Why and how the experiment was done.
• What the new results mean and how the new results were accepted.
• The data, but do not interpret data here (Pechenik 158).
Each hypothesis will be explained in at least one paragraph; do not feel confined to just one paragraph. What to write for a hypothesis (repeat for each):
• Restate the hypothesis.
• Explain and justify any variation in your experiment.
• Describe results alongside table(s) or figure(s) so readers understand the context of the data in this experiment.
• Explain hypotheses and whether results are congruent with what you expected (Evans 3).
Use a simple sentence structure to describe the results of each hypothesis. Try creating bullet points about the data to ensure you hit all key points. Properly selected text will support, or even repeat, ideas demonstrated in tables and figures. These pieces will support strong transitions (Evans 4). You must refer to each table or figure at least once throughout your manuscript; otherwise the table or figure has no business being attached. To effectively write about data, jot down ideas about your tables or figures. You will reflect on your data and thoroughly explain it to readers in a way that helps them see what you want them to see within your data. Explaining your data allows you to elaborate on these ideas within the discussion to further your document's implications (Evans 6).

Tables and Figures

As the author of a scientific manuscript, you are responsible for making your own tables and figures. Graphs are figures, not tables. Each table and figure needs an Arabic number, for example, Table 1, Fig. 1 (Evans 10). You must reference each table and figure you make throughout your document, or the table/figure does not belong in the scientific thread you are creating. Ensure any mention of tables and figures is relevant to readers (Pechenik 84). Use the following guide to help decide whether a table or a figure will be most effective. Use a table to show:
• Precise numerical values, or other pieces of specific data, within a small area.
• Comparisons and contrasts between data values or characteristics within or among related and/or shared characteristics or variables.
• Presence or absence of a specific characteristic (Rodrigues 2).
Use a figure to show:
• Trends, patterns, and relationships between and/or across sets of data when the pattern, not the exact data, is most important; these will be displayed via graphs that feature data plots.
• Summaries of research results using graphs that utilize data points, such as maps or pie charts.
• Visual explanations of the progression of events, features, procedures, or characteristics, via images, photos, maps, or schematically drawn diagrams (Rodrigues 2).
Each table and figure must be able to stand alone, meaning readers must be able to understand every aspect of the figure or table on its own without reflecting back on your manuscript. Any data
that is presented to readers must be clear in both meaning and makeup (Pechenik 37). Readers should not have to move around within the text to understand data. If the study is detailed and uses subtitled sections, reuse the same subtitles here (Turbeck 422). Tables and figures should be placed in your manuscript in numerical order. The first table presented will be labeled "Table 1," as you cannot present "Table 2" before you present readers with "Table 1," within both the text and the figures themselves. The same rule applies to figures. Everything that is not a table is a figure and should be listed as such. This means that all images that are not tables receive figure captions (Pechenik 159). Figure captions must include all the information needed to describe said item completely. Captions should present the research questions that figures address (Pechenik 163). Data are not meant to take up space within a manuscript, but rather to back up your hypothesis (Pechenik 159). However, hypotheses can change based on good data. Do not analyze your data with a mindset focused on your initial hypothesis. Do not try to force your data to reflect your hypothesis. If possible, ask other scientists if they come to similar conclusions from your data. Data should be arranged in a way that will reveal trends and add to your overall argument. Do not graph all data. Appearance matters for the presentation of tables and figures; avoid overcrowding plots. Use only 3 or 4 data sets per figure (Borja 3). Consider what size axis labels will be appropriate and easy to read. Symbol usage should be clear to readers. Tables and figures should not be boring or meaningless; instead, create a "supplementary material" section and add relevant, but unexciting, information there (Borja 4). Describing figures is key, as only showcasing tables and figures does not serve as an effective part of your results section.
Do not make readers interpret data; tell readers what to see by stating what you would like them to see and then citing figures or tables to back up what you have said. It is important to note that each table needs a description and each figure needs a figure legend. Each axis will be labeled. Each graph requires an appropriate title that explains what was done. Tables have descriptions, which sit under the table to help guide readers. Tables also have titles to help guide readers. Where you choose to put certain information is critical for clear analysis of data by readers. Figures can have captions of any length. Figures will include figure captions, which can contain parameters like r² values and equations of graph line(s). Adding these items to figures will ensure that figures can stand alone when being read. To ensure clarity, ask a fellow scientist to look over table(s) to confirm they understand what they are viewing, as the Tables and Figures section must function on its own. Each graph needs clearly labeled units of measurement (Pechenik 162). Each graph needs numerical intervals that allow readers to understand what the value of each data point is (Pechenik 162). Symbol use should have clear meaning for readers, to avoid any possible confusion (Pechenik 162).
Indicate the species studied and sample size within graphs. Specificity is key for providing precise, clear information to readers (Pechenik 181). Precision within graphs ensures readers do not have to leaf back to other sections to understand figures or tables. Create graphs that look alike. Similar graphs ensure no variation for readers and help to eliminate any ambiguity. All tables and figures must be originally made by you, or you must cite their origin if they are not your own (Evans 1). Images should be saved in standard formats such as *.BMP, *.TIF, *.TIFF, *.PNG, or *.PDF; raw files are ideal but often too large. Images need scale markers (Borja 4). Science journals only print polished tables and figures; this is what is expected of you. Copying and pasting Microsoft Excel worksheets is not adequate. Tables and figures should be created within the Microsoft suite of programs. If the journal you are submitting your document to has a limit on tables and figures, or has any specific guidelines, you must follow them. Microsoft Paint can be used to edit and refine any issues with your worksheets. Falsifying data is an obvious no. False data will discredit your entire document and possibly your career; do not do it. Discuss any observed non-random variation when variation affects data. Trends are often important and should be reported. Do not dismiss any variations as unimportant (Turbeck 422). Negative results should not be seen as negative. Negative results simply do not support your current hypothesis. This is perfectly okay, as you might change your hypothesis once you have gotten data; data can often change the entire hypothesis and range of the topic. Below are examples of a good and of an inadequate figure. In final publication, figures are going to be reduced in size.
Therefore, when a figure is constructed in Microsoft Suite, the fonts, data points, and lines should be disproportionately large, so that they will be legible when photo-reduced. Figures should limit white-space so that the information can be presented effectively.
[Figure: plot of Number of Outliers versus Accuracy (%)]

A good figure caption for the above figure would read as such: "Fig. 1. Relationship between numbers of outliers from Validate Model programs as a function of predictor accuracies (%) from WEKA decision trees. Data are shown in Table 1. The equation of the line was y = −9.90x + 1210; r² = 0.89" (Bertoli 21).
Figure 2. An inadequate figure, one that would not be selected for publication: (a) It does not use proper parameters and poorly utilizes white-space; (b) the font size will not be easy to read when published. While this is the same figure technically as Fig. 1, it does not represent the data effectively.
Additionally, figures should have the equation of the line in the figure caption, not in the figure. Figures and tables should be complete so they can be understood without reading the text. Below is an example of a good table. All of the numbers follow the same number of significant figures and line up accordingly, the table is left-justified, and it presents information in an effectively informative manner. All text has identical size and format. Again, tables should be understood without reading the text. Tables and figures cannot be placed in the manuscript or the appendix out of numerical order.

Table 1. Comparisons of standard deviation values, numbers of outliers and predictor accuracies for samples of Carnegiea gigantea.

Predictor surfaces                                  Standard Deviation¹  Number of outliers¹  Predictor Accuracy²
North-left troughs (90%)³                           14.9                 211                  100
North-crests (95%) and east-right troughs (90%)     19.9                 241                  98.6
West crests (97%) and east crests (97%)             25.8                 279                  93.7
East-right troughs (98%) and west-crests (97%)      23.0                 252                  97.9
South-right troughs (95%) and east crests (97%)     24.7                 259                  96.5

¹ Standard deviations and numbers of outliers were determined with Validate Model.
² Predictor accuracies were determined with WEKA.
³ Values in parentheses were the percentages used for WEKA decisions (Bertoli 18).
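As an aside for readers who wonder where caption parameters such as the equation of the line and the r² value come from, the following sketch (my own illustration, not part of the original study's workflow) derives them from the Table 1 data by ordinary least squares. The slope (−9.90) and r² (0.89) match the Fig. 1 caption; the intercept comes out near 1212, which the published caption evidently rounds to 1210.

```python
# Illustrative only: recompute the Fig. 1 caption parameters (slope,
# intercept, r^2) from the Table 1 data with ordinary least squares.

accuracy = [100, 98.6, 93.7, 97.9, 96.5]   # x: predictor accuracy (%)
outliers = [211, 241, 279, 252, 259]       # y: number of outliers

n = len(accuracy)
x_mean = sum(accuracy) / n
y_mean = sum(outliers) / n

# Sums of squares and cross-products about the means
sxx = sum((x - x_mean) ** 2 for x in accuracy)
syy = sum((y - y_mean) ** 2 for y in outliers)
sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(accuracy, outliers))

slope = sxy / sxx                      # fitted slope, about -9.90
intercept = y_mean - slope * x_mean    # fitted intercept, about 1212
r_squared = sxy ** 2 / (sxx * syy)     # coefficient of determination, about 0.89

print(f"y = {slope:.2f}x + {intercept:.0f}; r^2 = {r_squared:.2f}")
```

A few lines of arithmetic like this are worth keeping alongside the figure file, since the caption, the plotted trend line, and the table must all agree.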
Below is an example of a poor table that is not suitable for publication. The problems are: (1) non-uniform significant figures, (2) font size and uniformity, (3) unaligned text, (4) an incomplete title, and (5) lack of adequate citations (such as footnotes).

Table 2. Comparisons of standard deviation values, numbers of outliers and predictor accuracies for samples

Predictor surfaces                                  Standard Deviation   Number of outliers   Predictor Accuracy²
North-left troughs (90%)¹                           14.92                211                  100
North-crests (95%) and east-right troughs (90%)     19.9                 241                  98.6
West crests (97%) and east crests (97%)             25.8                 279.0                93.7
East-right troughs (98%) and west-crests (97%)      23                   252                  97.9
South-right troughs (95%) and east crests (97%)     24.7                 259                  96.5

¹ Predictor accuracies were determined with WEKA
² Values in parentheses were the percentages used for WEKA decisions (Bertoli 18).
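The non-uniform significant figures in Table 2 are easy to avoid by formatting every value in a column the same way before assembling the table. A minimal sketch (my own illustration; the helper name `format_column` is hypothetical, and the sample values come from the tables above):

```python
# Hypothetical helper: render all values in a table column with the same
# number of decimal places, so entries like 14.92, 23, and 279.0 never mix.
def format_column(values, decimals):
    return [f"{v:.{decimals}f}" for v in values]

std_dev  = [14.92, 19.9, 25.8, 23, 24.7]   # mixed precision, as in Table 2
outliers = [211, 241, 279.0, 252, 259]

print(format_column(std_dev, 1))   # one decimal place throughout
print(format_column(outliers, 0))  # whole numbers throughout
```

The same discipline applies whatever tool builds the table; the point is that precision is a column-level decision, not a cell-level one.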
Discussion

Many students struggle with understanding the function of the discussion. The discussion is not an area where you describe results. The results were already highlighted in the results section. Within the discussion you are expected to reference results, but do not repeat what you have already
stated (Evans 6). The discussion is difficult to construct because you are trying to convince your readers that you have an exciting topic, on which you have done different and exciting experiments that have produced plentiful results. The first thing to do when writing the discussion is to interpret your results (Turbeck 422). Your goal is to tie together your introduction and results sections (Turbeck 422). Start by explicitly stating the main finding of your research (Turbeck 422). Reiterate ideas you focused on in the introduction to re-engage readers (Turbeck 422). Next, show how your experiment added something to the field and filled in the previously discussed knowledge gap (Turbeck 422). You will likely use a variety of verb tenses throughout the discussion. When describing your results, write in the past tense. When referring to a table or figure, use the present tense. Future tense should be used to describe something that will be done in the future. Within the first paragraph of the discussion, address the questions and hypotheses that you brought up earlier in the manuscript. Tie each hypothesis to specific evidence from within your research (Turbeck 422). If you have written multiple hypotheses, show the implications of each, with explanations that highlight the possible importance of each. Keep the discussion concrete by referring to results that back up your interpretations (Turbeck 422). This is where your previous research is important. You must start questioning yourself. Ask yourself: how do my results stack up to studies similar to mine? If there is variation, question where this could stem from and discuss any variation here (Turbeck 423). Get in the habit of questioning everything. Science does not prove anything. Science will always be modified and disproven without embarrassment (Evans 7). The purpose of science is not a quest for truth, but a quest for knowledge; without questioning we cannot learn.
Thus, the best scientists, and their science publications, ask many more questions than the scientist can ever answer (Evans 7). Learning to question is important as a scientist, even if the questioning stems from questioning your own ideas. If one were to reproduce the above paragraph via the American Physical Society accepted citation format, it would look like this: "You must start questioning yourself. Ask yourself: how do my results stack up to studies similar to mine? If there is variation, question where this could stem from and discuss any variation here.¹ Get in the habit of questioning everything. Science does not prove anything. Science will always be modified and disproven without embarrassment.² The purpose of science is not a quest for truth, but a quest for knowledge; without questioning we cannot learn. Thus, the best scientists, and their science publications, ask many more questions than the scientist can ever answer.² Learning to question is important as a scientist, even if the questioning stems from questioning your own ideas."

¹ Turbeck 423
² Evans 7
Some of these questions might be helpful when trying to start your discussion, as questions will lead to a strong discussion:
• What did you expect to find and why?
• How do your actual results compare with expected results?
• How do your actual results interact with each of your hypotheses?
• How do you explain unexpected results?
• How might you further this experiment? (Pechenik 188)
Restate your hypothesis and give your results through the perspective of your methods. Next, present what you found without restating your results. Put results in the context of what all this means for science. Go through everything, step by step, and come up with conclusions to write down. State them, even if you previously stated them in the results (Evans 3). The following questions should be answered for each hypothesis:
• Do the data support the hypothesis?
• Could anything be added to help further the understanding of results? (Add it here if yes.)
• Are there ways to advance the method used?
• Is there a way to complete the experiment more efficiently?
• If your data do not support your hypothesis, what could have gone wrong throughout the process?
• Were all aspects performed effectively?
• What flaws occurred that affected the experiment? (Evans 6)

References

The References section should list, in alphabetical order, all scientific papers you refer to in your manuscript. Websites such as BibMe and Purdue OWL are good sources for brushing up on current citation rules. These sources can also provide bare-bones citations for creating a preliminary References section. See also pages 67-76 of A Short Guide to Writing about Biology. Suggestions:
• Format your in-text citations as follows: (Author, year published) or (Author et al., year published) (Turbeck 419).
• In-sentence citations are like so: Author (YEAR) or Author et al. (YEAR) (Turbeck 419).
Below is a full citation for Turbeck's "Scientific Writing Made Easy," and this (Turbeck 2016) is a proper in-text citation for most of the science disciplines. Turbeck, Sheela P., et al.
“Scientific Writing Made Easy: A Step-By-Step Guide to Undergraduate Writing in the Biological Sciences.” Bulletin of the Ecological Society of America, vol. 97, no. 4, Oct. 2016, pp. 417-426. EBSCOhost. (Add all the authors.) Helpful science indexes available to students:
• A Short Guide to Writing about Biology, page 24, provides a detailed list of other databases for more specialized topics (many databases are available through our school).
• Science Citation Index.
• Biological Abstracts.
• BIOSIS Previews and Basic BIOSIS.
All of these sources are accessible on ISI Web of Knowledge.
Abstract

The abstract is the last thing you write and is a summary of the entire manuscript. Your abstract will reference your research results. Additionally, it will discuss summary data, including specific numbers. Without including results, the abstract has no proof behind it. If your abstract does not contain summary data, the abstract is weak. The abstract can borrow sentences from your manuscript. Thus, you are not implementing new ideas, but reiterating what you have already said (Evans 7). Since these are your own words and thoughts, you are allowed to copy them; it is not plagiarism. An abstract may answer the following questions:
• Why did you do this experiment?
• What problems does your manuscript address?
• How were problems approached?
• What are some critical results found?
• What possible conclusions can be drawn? (Pechenik 202)
The abstract will be the shortest section of your manuscript, usually taking up about 10 sentences. Here is a general breakdown of the sentence-by-sentence structural makeup of the abstract:
• In 1-3 sentences, describe problems and give any needed background information.
• In 1-2 sentences, describe each of your individual hypotheses in their own respective sentences.
• In 1-3 sentences, describe any experimental approaches or methods.
• In 1-3 sentences, describe what you found.
• In 1-2 sentences, wrap up with definitive conclusions (Evans 7).
Suggestions:
• Use the word "and" sparingly, as "and" connects two ideas that frequently are not meant to be conflated (Pechenik 90).
• Genus and species names will look like Homo sapiens. The first letter of a genus is capitalized, while the first letter of a species name is lowercase. Do not forget to underline or italicize scientific names (Evans 9).
• In each individual section of your manuscript, the first mention of a genus will be spelled out for readers.
Within individual sections of your manuscript, later mentions of a genus name can be abbreviated to its capitalized first letter followed by a period (Evans 9).
• When using an uncommon acronym for the first time in a manuscript, introduce it: place the acronym in quotation marks next to the spelled-out phrase at first use. “DNA” is fine on its own, whereas “5-ht” in place of “serotonin” is not. For more obscure acronyms, write “herein called . . .” (Pechenik 91). If an abbreviation is used only a couple of times, it may not be needed and can simply be omitted at the discretion of the author.
• Abbreviations such as “bc” in place of “because” may be acceptable in emails, texts, and other informal writing, but not here; this is formal science writing, and such abbreviations are not warranted (Evans 10).
• Page numbers are mandatory and are not to be left out (Evans 10).
Best Practices for Formal Writing
Writing style for science publications
Using parallel structure when introducing new ideas is important, as parallelism helps readers grasp meaning more easily. Sentences will model each other and follow a linear pattern to create a sense of flow within paragraphs (Greene 60). Persuasive writing thrives when the argument progresses from least important information to most important. This is why authors place the most important information at the end of a sentence; readers remember it best there. The same applies to the ends of paragraphs and sections (Greene 78). Scientists will often open a paragraph by presenting a problem and then use the rest of the paragraph to solve it (Greene 79). Linking and transitioning are important within manuscripts, as readers should never have to back up to understand what they are reading. To link topics successfully, a bit of summarizing is needed to synthesize information about the topic (Pechenik 9). To make a clear draft, link topics together with transitional words such as “thus,” “therefore,” and “in addition” (Pechenik 8). Science writing is meant to inform, not to impress. To inform effectively, you must define all specialized terminology (Pechenik 6). Do not tell readers something is interesting; show them how it is interesting (Pechenik 85). Writing with exact language is important; unclear word choice leads to ambiguity in a sentence and a lack of overall clarity in the manuscript (Pechenik 8). Support your statements of fact and opinion with evidence from your own data, such as a table, a figure, or statistical results (Pechenik 6-7). Understand the differences between facts, opinions, and possibilities within your research.
You may form opinions about your data while you are investigating, but these are not facts and must not be represented as facts in your manuscript (Pechenik 7). If you are unsure about writing an opinion in a sentence, do not just examine the sentence; you may have to reexamine the opinion to understand why you hold it. If you have convinced yourself of something, you may still have to convince your reader, since your paper should not contain bare opinions, only ideas that you are attempting to prove (Pechenik 85). Words to avoid: “they,” “it,” “their,” “these,” “this,” and “them” (Pechenik 9 and 88). All such words are vague and can create ambiguity within the sentences of your manuscript (Pechenik 9 and 88). Words like these make readers feel the need to back up and reread, which is to be avoided at all costs. Clarity is not the only factor; being concise is also important. Remove words that merely sound nice, so your reader can digest the meaning of your manuscript more easily.
Grammar
Active voice requires fewer words and makes clear who or what is doing the action. Therefore, active voice provides clarity in most sections (Greene 22). What you are talking about is the thing
doing the action. Understanding this will help you use the active voice (Pechenik 97). Passive voice gives the writer the opportunity to describe how a task was done without saying who did it (Greene 26). Thus, the passive voice lets the writer avoid naming the actor. Be cautious when employing the passive voice (Pechenik 96). Use weak verbs, or non-action verbs, sparingly (Pechenik 95). By removing weak verbs such as “indicate” and “suggest,” and vague words such as “possibly” and “maybe,” you can strengthen your overall argument (Greene 46). Connotation is also important when weighing word choice. Use decisive, descriptive words such as “clearly,” “undoubtedly,” “major,” and “necessary.” Descriptive words help to reinforce and highlight key ideas and concepts (Greene 47). During the first draft, look for connections between sentences and paragraphs. While doing so may be challenging, creating connections rewards the overall style and flow of the writing. A semicolon can link two sentences with connected ideas; used effectively, as in this sentence, the semicolon is a great way to join two short sentences or break up a run-on sentence. A comma splice occurs when an author uses a comma to join two separate sentences. It can be fixed by replacing the comma with a semicolon, or by joining the sentences with a conjunction such as “but” or a transitional word such as “however” (Pechenik 107). Use commas to set off a formal species name when it follows the common name (Pechenik 107). Use prepositions sparingly to keep writing concise. Replace long verbal phrases with single verbs for clarity (Pechenik 94). Introductory phrases such as “additionally,” “therefore,” and “due to” can frequently be removed from your manuscript during its final revision stages (Pechenik 93).
Frequently misused words:
• “Between” refers to only two things, while “among” refers to more than two (Pechenik 104).
• “Effect” can mean “a result or outcome” in its noun form and “to bring about” in its verb form (Pechenik 105).
• The verb “affect” means “to influence or to produce an effect upon” (Pechenik 106).
• “I.e.” stands for the Latin “id est” and means “that is” or “that is to say,” while “e.g.” stands for “exempli gratia” and means “for example” (Pechenik 106).
Sentence structure
Sentence structure, along with sentence order, is key to a paragraph. Each sentence should prepare readers for the “statement of intent,” or the meaning behind your research (Pechenik 198). Each sentence needs to carry relevant content (Pechenik 82). Individual sentences should also follow a logical order. Positioning verbs and nouns close together can help make sentences clearer (Greene 19). Never use “it,” “they,” or any other non-descriptive word to refer to anything here; such pronouns should be omitted.
Varying sentence structure helps to keep readers engaged, leaving neither short, choppy sentences nor long, wordy sentences to decipher (Greene 63). Run-on sentences may seem like a good idea when you write them, but they often confuse readers. Leave the manuscript for a while, then return and attempt to understand each run-on sentence’s meaning. Try breaking run-on sentences into two sentences so the information is distributed clearly. You will likely need to remove unnecessary words or phrases. Science writing is about being concise while still getting all important information across. Common things to avoid:
• Starting a sentence with “and,” “but,” or “because.”
• Having fewer than three sentences in a paragraph.
• Ending a sentence with a preposition.
• Splitting an infinitive (Pechenik 109).
Paragraph Structure
Paragraphs in scientific manuscripts should run about 5-6 sentences. Writing paragraphs for science follows a repetitive style that eliminates ambiguity for readers (Writing Paragraphs for Science). Following a pattern helps create a sense of rhythm or flow within your writing. Each paragraph will open with a topic sentence that introduces the rest of the paragraph (Greene 69). Before starting a paragraph, know what its topic sentence is going to be. This will be the first sentence unless a transition is needed. Organize each paragraph around this one idea and discuss the idea thoroughly (Greene 52). After the topic sentence, you can then develop the issue by including examples or opinions (Greene 69). The last sentence acts to confirm that readers understood what they just read (Greene 70). By putting old information at the beginning of the paragraph and transitioning gradually to new information at the end, you can take readers on a guided journey through your thoughts and show how your work connects to pre-existing science (Greene 52-53). Science demonstrates results through steps. Thus, it is often best to present information in steps, or in chronological order (Greene 75). If the information does not make sense chronologically, do not force it. Instead, refocus the document so the organization is logical and easily interpreted by readers (Greene 76). To be logical, try to move from general to specific, guiding the audience through the necessary information (Greene 76). This is consistent with the funnel idea presented in the introduction and will help you communicate new information to your readers successfully. In addition to linking sentences, linking between paragraphs is key.
Linking, or using transitions, creates a sense of cohesion within your manuscript and cements ideas most effectively in readers’ minds. A transition uses a sentence or phrase to connect two different ideas. The cohesion that linking provides allows readers to follow your manuscript easily, with little struggle to track the movement between ideas. Using transitions
will also eliminate readers’ need to flip back through your manuscript to reflect on and understand the material presented.
Font Size
Set your manuscript in 12-point Times New Roman, in black ink. Do not vary the font within tables and figures.
Title
Create a title that conveys the meaning of the information you present throughout your manuscript. Make your title revealing, so that readers understand what they are getting into. The title should also be engaging, to get readers excited about your topic, as long as it accurately represents the subject of your manuscript.
Symbol and Number Use
Use numerals rather than words for percentages, decimals, magnifications, and abbreviated units of measurement (Pechenik 185). Spell out ordinal numbers (“first,” “second,” “third”) up through “ninth,” then switch to numeral forms such as “10th” (Pechenik 186). A decimal smaller than 1.0 always begins with a 0 (Pechenik 186). Use scientific notation for very large or very small numbers (Pechenik 186). Additionally, always follow numbers with the appropriate units (Pechenik 187). Use spaces effectively; for example, put a space between a number and its unit of measure. Express ratios as “Y:Y.” Avoid fractions, as fractions do not tell readers what a ratio means; a ratio is acceptable only once it has been thoroughly explained and its meaning is clearly understood (Evans 10). The precision reported for a number is determined by the other numbers alongside which it is expressed, much as in a typical lab report. Thus, all numbers in a group, such as the data in a chart, will be reported to the same precision. Within the group 205.33, 0.3453, and 23.043, values should be reported only to the hundredths place, because 205.33 extends no further.
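The common-precision convention and the round-half-to-even rule described in this section can be illustrated in Python, whose built-in round() happens to follow the same half-to-even convention; the numbers below are this section’s own examples, and the two-decimal format string is simply one way to align a group of values to its least precise member:

```python
# Python's round() implements "round half to even," the same rule
# this section recommends for digits ending in 5.
print(round(1.25, 1))  # 1.2  (5 rounds down to the even digit 2)
print(round(1.75, 1))  # 1.8  (5 rounds up to the even digit 8)

# Reporting a group of values to a common precision: two decimal
# places, matching the least precise value, 205.33.
values = [205.33, 0.3453, 23.043]
formatted = [f"{v:.2f}" for v in values]
print(formatted)  # ['205.33', '0.35', '23.04']
```

This is only a sketch of the arithmetic, not a required tool; the point is that rounding every 5 upward would inflate a data set, while alternating to the even digit keeps group averages unbiased.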
Generally, only three significant figures are reported; thus, for example, 1009 will be written as 1010 (Evans 14). Rounding is straightforward for every digit except 5. You cannot round 5 up every time, or the overall values inflate and can skew the data. The rule for 5 is therefore to round to the nearest even digit: 1.25 is rounded to 1.2, and 1.75 is rounded to 1.8 (Evans 14).
Acknowledgements
The acknowledgements function as a way to thank those who helped with your manuscript’s production and its research. This is a way to recognize others’ contributions to your project without having to justify authorship for their efforts. Thank your funding sources here, as well as those who provided technical help with your analysis, writing, and proofreading (Borja 11). Additionally, formula boxes are problematic for formatting; do not build in-line equations into formatted boxes.
Key Words
Think of keywords as you work through your paper, keeping a running list of the words your paper centers on. Keywords serve as an index for your paper, so that it can be categorized accordingly. Avoid general words, words already in your title, and words that appear in the title of the journal (Borja 10).
Follow existing design
Science writing follows strict style guidelines and maintains them rigorously throughout. Scientific style is not arbitrary and does not call for poetic expression; instead, science sticks to its established design. Thus, when writing, do not deviate from the traditional style. Use this and other reference books as guides to scientific style. Look at your first draft and assess its potential. Not all first drafts can become final drafts; sometimes, if a draft is not strong enough, you must scrap it entirely and start over. You must arrive at the most effective way to represent your months of research in order to get your document published.
Editing Drafts
Drafts reveal gaps between what you write and what you did, or want to convey, about your research. Your job as a science writer and editor is to bridge those gaps. When editing, attend first to content and organization. These take the most time and dictate how the rest of the editing process will go (Pechenik 110). To revise the content of your manuscript, think of the audience for whom it is being written and make your writing self-sufficient, to ensure clarity (Pechenik 85). When editing, put yourself in readers’ heads to set aside your own knowledge biases. Doing so ensures that readers, who know nothing beyond what you tell them, will be able to read the manuscript and understand everything you have written about your arguments and theses. A series of steps needs to occur in any draft. Before any real editing can happen, you must have all parts of the draft together to see what is present and what needs to be added. Here are some questions to ask yourself to ensure that your draft is complete:
• Have you completed your abstract?
• Are items in the right order?
• Are sections too long or too short (based on the number of hypotheses)?
• Do sections fully cover everything you want to say?
• Are data and results representative of your final goal?
• Do the tables and figures represent your data? (Evans 1)
Once you have taken a break from your manuscript, come back to it and ensure it still says everything you intended. Ask yourself whether the writing is solid, thorough, and clear. A big help in editing is finding a new way to look at your manuscript. A new way of editing
could be printing the manuscript out to edit a hard copy, or reading the entire piece aloud to catch errors. Ask someone else to read your manuscript, as a fresh reader is best able to identify what needs clarification; remember, you know the topic and readers do not. You can, and should, go back to make sure you presented all your information correctly and thoroughly. Save your drafts often: each time you make a significant edit, create a new file, so that multiple versions of your manuscript are preserved in case something happens. Title your drafts something short and clear, such as your last name, topic, and date; for example, “Jones-chl-01Dec” for last name Jones, a chlorophyll experiment, and December 1 (Evans 2).
Second Draft
A strong second draft of your manuscript will answer the following questions:
• Why was this experimental design chosen for the questions being asked?
• Why were the controls needed?
• Are the results conclusive? If not, explain why not.
• Are the conclusions reasonable?
• How can the study be expanded? What does the study still not address about the greater overall question? (Pechenik 35)
Here are some ways to revise what you currently have written:
• Reorganize ideas by listing the ideas in each paragraph, then moving paragraphs around so the ideas flow in a better order. This can usually happen only when a first draft is complete, or when the draft has been fully outlined.
• Revise for content by removing unnecessary words, phrases, and sentences, ensuring that everything in your manuscript is there for a reason.
• Clarify each use of scientific terminology. The more readers have to guess at what you are saying, the less they can appreciate your writing and arguments. Remove ambiguities as much as possible.
• Revise for completeness of topic by being more specific.
• Revise for flow and transitions, to ensure that topics move smoothly throughout.
• Revise for grammar and spelling last, while double-checking the precision of scientific terms and the accuracy of ideas (Pechenik 78).
To edit successfully, approach editing in stages; otherwise, revision and editing combined will be too much to work through thoroughly:
• Ensure you address the main issues and ideas that your study covers. Content is important; remove any text that does not focus on your topic.
• Ensure transitional flow, so as not to write jumpy, confusing paragraphs.
• Reread to fine-tune.
• Edit sentence structure and word choice to articulate your research.
• Edit grammar and spelling last, because they will change while the content is being revised.
Final Draft and Proofreading
Proofreading, the final stage, is important because every error detracts from your credibility and from readers’ overall experience. Errors suggest that you have little care or consideration for your document. Much of manuscript publishing, and grading, involves subjectivity, which can swing against you if the manuscript is hard to read (Pechenik 12). Given the critical nature of scientific manuscript writing, it is advisable to edit sentence by sentence and paragraph by paragraph. Living in the era of computers does not mean that a computer can think for you, and it especially does not mean that a computer can edit your document for you. Programs like Microsoft Word include a spell checker, but editing software cannot catch every editing and proofreading mistake. Thus, rely on no editor besides yourself: the spell checker can be one of your first steps in editing, but certainly not the only one (Pechenik 15). Editing software will often miss errors or even suggest the wrong word, so rereading is crucial to ensure grammatical accuracy. Do not insert figures and tables into the body of your manuscript unless instructed to do so (Pechenik 204). If the manuscript will be submitted to a journal, follow the journal’s guidelines and study its published manuscripts to see how yours should look.
Final Order of Manuscript
The final manuscript will follow this order:
• Title page
• Abstract
• Introduction
• Materials and Methods
• Results
• Discussion (this may include a formal conclusion when instructed)
• Acknowledgements
• References
• Appendix
• Tables (in Arabic numerical order)
• Figure legends (in Arabic numerical order)
• Figures (in Arabic numerical order)
Acknowledgements This manuscript was made possible through the Manhattan College School of Science Summer Research Scholars, sponsored by the Linda and Dennis Fenton ’73 and Catherine and Robert Fenton Biology Research Fund. The author would like to acknowledge Dr. Lance Evans, Catherine and Robert Fenton Endowed Chair of Biology, for his tireless guidance throughout the writing
and editing process. Thanks are extended to Dr. Ashley Cross and RikkiLynn Shields for their extensive editing efforts. Additionally, thanks are given to Dr. Constantine E. Theodosiou, Dean of Science, for his constant support of this endeavor.
References
Bertoli, Mia, and Lance Evans. “Bark Coverages Predict Death of Saguaro Cacti (Carnegiea gigantea).”
Borja, Angel. “11 Steps to Structuring a Science Paper Editors Will Take Seriously.” 1st Edition, Butterworth-Heinemann, 24 June 2014.
Evans, Lance. Appendix A. A Laboratory Manual for Plant Biology. Manhattan College, 2006.
Finkelstein, Leo. Pocket Book of Technical Writing for Engineers and Scientists. McGraw-Hill, 2007.
Greene, Anne E. Writing Science in Plain English. Univ. of Chicago Press, 2013.
Pechenik, Jan A. A Short Guide to Writing about Biology. 9th ed., Pearson/Prentice Hall, 2015.
Rodrigues, Velany. “Tips on Effective Use of Tables and Figures in Research Papers.” Editage Insights, 4 Nov. 2013.
Turbeck, Sheela P., et al. “Scientific Writing Made Easy: A Step-By-Step Guide to Undergraduate Writing in the Biological Sciences.” Bulletin of the Ecological Society of America, vol. 97, no. 4, Oct. 2016, pp. 417-426. EBSCOhost.
APPENDIX
Helpful pages in the reference texts:
• Page 83 of Writing Science in Plain English offers a list of transitional words.
• Pages 237-238 of Pocket Book of Technical Writing for Engineers and Scientists provide a quick guide to punctuation.
• The most useful pages of A Short Guide to Writing About Biology: page 150 summarizes the components of a lab report; page 11 provides a nine-point table of nuances of science writing; pages 110-112 explain how to become an effective reviewer of scientific publications; page 117 is a summary checklist for editing; and page 205 is a final-draft checklist.
Sources to Consult for Formatting the Paper:
• AMA [American Medical Association] Manual of Style
• Fowler’s English Usage
• Gowers’ Plain Words
• Scientific Style and Format
• The ACS [American Chemical Society] Style Guide
• The Chicago Manual of Style