Plant Genomic and Cytogenetic Databases Methods in Molecular Biology 2703 Sonia Garcia

Methods in Molecular Biology 2703

Plant Genomic and Cytogenetic Databases

Series Editor: John M. Walker

School of Life and Medical Sciences

University of Hertfordshire Hatfield, Hertfordshire, UK

For further volumes: http://www.springer.com/series/7651

For over 35 years, biological scientists have come to rely on the research protocols and methodologies in the critically acclaimed Methods in Molecular Biology series. The series was the first to introduce the step-by-step protocols approach that has become the standard in all biomedical protocol publishing. Each protocol is provided in readily-reproducible step-by step fashion, opening with an introductory overview, a list of the materials and reagents needed to complete the experiment, and followed by a detailed procedure that is supported with a helpful notes section offering tips and tricks of the trade as well as troubleshooting advice. These hallmark features were introduced by series editor Dr. John Walker and constitute the key ingredient in each and every volume of the Methods in Molecular Biology series. Tested and trusted, comprehensive and reliable, all protocols from the series are indexed in PubMed.

Editors

Sònia Garcia

Botanical Institute of Barcelona

IBB (CSIC-Barcelona City Council)

Barcelona, Spain

Neus Nualart

Botanical Institute of Barcelona

IBB (CSIC-Barcelona City Council)

Barcelona, Spain

ISSN 1064-3745; ISSN 1940-6029 (electronic)

Methods in Molecular Biology

ISBN 978-1-0716-3388-5; ISBN 978-1-0716-3389-2 (eBook)

https://doi.org/10.1007/978-1-0716-3389-2

© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Science+Business Media, LLC, part of Springer Nature 2023

This work is subject to copyright. All rights are solely and exclusively licensed by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed.

The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use.

The publisher, the authors, and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, expressed or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This Humana imprint is published by the registered company Springer Science+Business Media, LLC, part of Springer Nature.

The registered company address is: 1 New York Plaza, New York, NY 10004, U.S.A.

Paper in this product is recyclable.

Preface

There is a myriad of online plant genomics resources, mainly due to the increasing availability of sequencing data brought about by technical advances in genome sequencing. New tools that allow fast sequencing at decreasing cost have prompted a surge of websites offering integrated omics information. Together with sequence data, information at the more structural level, related to chromosomes, karyotypes, or genome size, is also relevant to our understanding of plant genome evolution. Yet, despite its utility, such cytogenetic information seems to attract less interest and is less well connected with the bulk of genomic data. With this in mind, this volume of the Methods in Molecular Biology series offers an updated catalogue of online plant genomics and cytogenetic resources, which may be of interest to plant evolutionary biologists as well as to plant breeders. The chapters describe database targets, contents, and examples of uses, aiming to provide a practical guide to the online richness of plant genomics and cytogenetic resources. This compilation covers only databases and does not include analytic resources; to our knowledge, all reported databases are periodically updated and regularly maintained.

The book is organized into two parts, the first one devoted to plant genomic databases and the second to databases with a focus on cytogenetics. Part I contains eight chapters. In Chapter 1, Daniel Arend and colleagues describe the plant phenomics and genomics research data repository, discussing how advances in methodology and technology have helped in the study of plant genomics and phenomics, and how this has helped with issues like food security, biodiversity, and climate change. This data repository makes comprehensive plant research data publicly available to the global research community. Chapter 2, by Philipp E. Bayer and David Edwards, deals with pangenome graphs as a data structure to represent genome diversity. Although pangenome graphs have been created for some species, their visualization is challenging because of their size and complexity; here a workflow is presented that uses the tool Wheat Panache to search and visualize wheat pangenome graphs. The next four chapters present databases related to transposable elements (TEs), which are commonly used as evolutionary markers in genetic, genomic, and cytogenetic approaches. In Chapter 3, Simon Gaviria-Orrego and colleagues introduce us to InpactorDB, a plant LTR retrotransposon reference library. This database, available online, addresses the lack of freely available full-length and classified LTR retrotransposon reference sequences by collecting almost 70,000 elements that can be used to identify and classify LTR-RTs. Chapter 4, by Asmaa H. Hassan and coauthors, presents TEMM, another database, for transposon element-based molecular markers in plants. In Chapter 5, by Doğa Eskier and colleagues, we are introduced to PlanTEnrichment, an online database comprising 11 plant genomes to analyze stress-associated TEs. It provides taxonomic information on TEs and can report the enrichment values of the repeats; in addition, researchers with little or no experience in computational analyses can use this database to quickly identify TEs of interest. Chapter 6, by Morad M. Mokhtar and colleagues, describes the updated database CicerSpTEdb2.0, which now includes more accurate intact LTR-RT elements, with annotation of internal domains, belonging to species of the genus Cicer (chickpeas), providing a valuable resource for comparative genomics of the crop Cicer arietinum and closely related species. Chapter 7, by Marie C. Henniges and Ilia J. Leitch, describes BIFloraExplorer, a taxonomic, genetic, and ecological data resource for British and Irish vascular plants, providing a clean, curated, and taxonomically resolved dataset that enables the study of natural ecological and evolutionary patterns and processes within the vascular flora of Britain and Ireland. Part I closes with Chapter 8, by Noe Fernandez-Pozo, presenting PEATmoss, a gene expression atlas for bryophytes, with abundant data on Physcomitrium patens and expanding to include information from Anthoceros agrestis and Marchantia polymorpha. PEATmoss offers visualization methods and tools for data download and has added features including, among others, co-expression network visualization and the ability to download replicate data.

Part II contains ten additional chapters related to cytogenetics and chromosome-related databases. Chapter 9, by Marie C. Henniges and colleagues, deals with the Plant DNA C-values database, the most trusted and widely used resource for accessing plant genome size data, along with information like ploidy level and chromosome number, and offering various query and filtering options. Updated six times since its inception in 2001, the most recent release includes data for ca. 12,300 species across various land plant and algal lineages. In Chapter 10, another well-known database in the plant cytogenetics community is presented by Anna Rice and Itay Mayrose: the Chromosome Counts Database (CCDB). Created in 2015, the CCDB is the central online resource for plant biologists interested in determining and documenting chromosome numbers of extant taxa. More specific databases, restricted to a geographical region, taxonomic group, or chromosome type or feature, are described next. In Chapter 11, Joan Simon and colleagues present CromoCat, an online database providing karyological information on the vascular plants of the Catalan Countries (Spain). It includes over 68,000 records belonging to more than 5000 taxa, and it has been developed for over 25 years. Chapter 12, by Kuniaki Watanabe and John Semple, presents the Index to Plant Chromosome Numbers in Asteraceae, a resource that has been useful for synantherologists for decades, giving information on chromosome numbers, ploidy levels, and genome sizes for species of the family Asteraceae, one of the largest families of angiosperms. The authors also take the occasion to review the processes of dysploidy, polyploidy, and hybridization that have frequently occurred throughout the family’s evolutionary history. In Chapter 13, Maria Bosch and coauthors present the newly updated (and newly online) version of the Delphinieae Chromosome Database (DCDB), a resource with karyological information on this tribe of the family Ranunculaceae, which contains species such as aconites and larkspurs. The DCDB provides the most accurate and complete information on chromosome numbers, ploidy levels, and other karyological data and is intended to be useful for research in cytotaxonomy, systematics, and evolution. Chapter 14, by Pedro Jara-Seguel and Jonathan Urrutia-Estrada, reviews and updates the Chilean Plants Cytogenetic Database, an online resource providing cytogenetic information for native and invasive plant species in Chile. In Chapter 15, Thomas Gregor and colleagues explain the online data portal “Chromosome numbers of the flora of Germany,” another resource with information on chromosome counts, ploidy levels, and flow cytometric data (genome size), with over 14,000 records. Chapter 16, by Mariela A. Sader and coauthors, reviews the available South American databases on chromosome numbers, as well as the contents of those databases containing cytogenetic information on South American species. Chapter 17, by María Luisa Gutiérrez and coauthors, reviews and updates B-chrom, a database on B-chromosomes for plants, animals, and fungi, which now provides information for ca. 3000 species. Similarly, Chapter 18, by Roi Rodríguez-González and collaborators, reviews and updates the Plant Ribosomal DNA database, presenting the results of the fourth update to the resource, which contributes information on rDNA loci number, position, and organization, besides other related data such as genome size or telomere type, for more than 2700 plant species.

As sequencing technologies continue to improve and more genetic, genomic, and chromosomal information is generated, it is becoming increasingly important to have resources like the ones described in the book you are holding. This collection of databases can help make this wealth of information more accessible and usable for plant science researchers.

Barcelona, Catalonia, Spain

Sònia Garcia

Neus Nualart

PART I GENOMIC DATABASES

1 The Plant Phenomics and Genomics Research Data Repository: An On-Premise Approach for FAIR-Compliant Data Acquisition

Daniel Arend, Uwe Scholz, and Matthias Lange

2 Investigating Pangenome Graphs Using Wheat Panache

Philipp E. Bayer and David Edwards

3 InpactorDB: A Plant LTR Retrotransposon Reference Library

Simon Orozco-Arias, Simon Gaviria-Orrego, Reinel Tabares-Soto, Gustavo Isaza, and Romain Guyot

4 TEMM: A Curated Data Resource for Transposon Element-Based Molecular Markers in Plants

Asmaa H. Hassan, Morad M. Mokhtar, and Achraf El Allali

5 PlanTEnrichment: A How-to Guide on Rapid Identification of Transposable Elements Associated with Regions of Interest in Select Plant Genomes

Doğa Eskier, Alirıza Arıbaş, and Gökhan Karakülah

6 CicerSpTEdb2.0: An Upgrade of Cicer Species Transposable Elements Database

Morad M. Mokhtar, Ahmed S. Fouad, Haytham M. Abd-Elhalim, and Achraf El Allali

7 “BIFloraExplorer”: A Taxonomic, Genetic, and Ecological Data Resource for the Vascular Plants of Britain and Ireland

Marie C. Henniges, Andrew R. Leitch, and Ilia J. Leitch

8 PEATmoss: A Gene Expression Atlas for Bryophytes

Noe Fernandez-Pozo

PART II CYTOGENETICS AND CHROMOSOME-RELATED DATABASES

9 The Plant DNA C-Values Database: A One-Stop Shop for Plant Genome Size Data

Marie C. Henniges, Emmeline Johnston, Jaume Pellicer, Oriane Hidalgo, Michael D. Bennett, and Ilia J. Leitch

10 The Chromosome Counts Database (CCDB)

Anna Rice and Itay Mayrose

11 CromoCat: Chromosome Database of the Vascular Flora of the Catalan Countries 25 years

Joan Simon, Maria Bosch, and Cèsar Blanché

12 An Overview to the Index to Chromosome Numbers in Asteraceae Database: Revisiting Base Chromosome Numbers, Polyploidy, Descending Dysploidy, and Hybridization

John C. Semple and Kuniaki Watanabe

13 DCDB: Chromosome Database of Tribe Delphinieae (Ranunculaceae): Structure, Exploitation, and Recent Development

Maria Bosch, Jordi López-Pujol, Cèsar Blanché, and Joan Simon

14 Chilean Plants Cytogenetic Database: An Online Resource for Embryophytes of the Southern Cone

Pedro Jara-Seguel and Jonathan Urrutia-Estrada

15 The Data Portal “Chromosome Numbers of the Flora of Germany”: Progress After Five Years, Recent Developments, and Future Strategies

Thomas Gregor, Stefan Dressler, Sebastian Klemm, Christiane M. Ritz, Marco Schmidt, Karsten Wesche, Jens Wesenberg, Georg Zizka, and Juraj Paule

16 South American Plant Chromosome Numbers Databases: The Information We Have and the Information We Lack on the Most Plant-Diverse Continent

Mariela A. Sader, Lucas A. Costa, Gustavo Souza, Juan D. Urdampilleta, Joan Simon, and Magdalena Vaio

17 First Update to B-Chrom: A Database on B-Chromosomes

María Luisa Gutiérrez, Roi Rodríguez-González, Inés Fuentes, Francisco Gálvez-Prada, Aleš Kovařík, and Sònia Garcia

18 Release 4.0 of the Plant rDNA Database: A Database on Plant Ribosomal DNA Loci Number, Their Position, and Organization: An Information Source for Comparative Cytogenetics

Roi Rodríguez-González, María Luisa Gutiérrez, Inés Fuentes, Francisco Gálvez-Prada, Jana Sochorová, Aleš Kovařík, and Sònia Garcia

Index

Contributors

HAYTHAM M. ABD-ELHALIM • Agricultural Genetic Engineering Research Institute, Agricultural Research Center, Giza, Egypt

DANIEL AREND • Leibniz Institute of Plant Genetics and Crop Plant Research (IPK), Seeland, OT Gatersleben, Germany

ALIRIZA ARIBAŞ • Bioinformatics Platform, Izmir Biomedicine and Genome Center (IBG), İnciraltı, İzmir, Turkey

PHILIPP E. BAYER • Centre for Applied Bioinformatics and School of Biological Sciences, The University of Western Australia, Perth, WA, Australia

MICHAEL D. BENNETT • Royal Botanic Gardens, Kew, Richmond, UK

CÈSAR BLANCHÉ • BioC (GReB, IRBio) – Laboratori de Botànica, Facultat de Farmàcia i Ciències de l’Alimentació, Universitat de Barcelona, Barcelona, Catalonia, Spain

MARIA BOSCH • BioC (GReB, IRBio) – Laboratori de Botànica, Facultat de Farmàcia i Ciències de l’Alimentació, Universitat de Barcelona, Barcelona, Catalonia, Spain

LUCAS A. COSTA • Laboratório de Citogenética Vegetal, Departamento de Botânica, Universidade Federal de Pernambuco, Recife, Brazil

STEFAN DRESSLER • Department of Botany and Molecular Evolution, Senckenberg Research Institute and Natural History Museum Frankfurt, Frankfurt am Main, Germany

DAVID EDWARDS • Centre for Applied Bioinformatics and School of Biological Sciences, The University of Western Australia, Perth, WA, Australia

ACHRAF EL ALLALI • African Genome Center, Mohammed VI Polytechnic University, Ben Guerir, Morocco

DOĞA ESKIER • Izmir International Biomedicine and Genome Institute, Dokuz Eylül University, İnciraltı, İzmir, Turkey; Bioinformatics Platform, İzmir Biomedicine and Genome Center (IBG), İnciraltı, İzmir, Turkey

NOE FERNANDEZ-POZO • Plant Cell Biology, Department of Biology, University of Marburg, Marburg, Germany; Institute for Mediterranean and Subtropical Horticulture (IHSM-CSIC-UMA), Algarrobo-Costa, Málaga, Spain

AHMED S. FOUAD • Botany and Microbiology Department, Faculty of Science, Cairo University, Giza, Egypt

INÉS FUENTES • Institut Botànic de Barcelona, IBB (CSIC-Ajuntament de Barcelona), Barcelona, Catalonia, Spain; Parc de Recerca Biomèdica de Barcelona (PRBB), Barcelona, Catalonia, Spain

FRANCISCO GÁLVEZ-PRADA • Bioscripts-Centro de Investigación y Desarrollo de Recursos Científicos, Sevilla, Spain

SÒNIA GARCIA • Institut Botànic de Barcelona, IBB (CSIC-Ajuntament de Barcelona), Barcelona, Catalonia, Spain

SIMON GAVIRIA-ORREGO • Department of Computer Science, Universidad Autónoma de Manizales, Manizales, Caldas, Colombia

THOMAS GREGOR • Department of Botany and Molecular Evolution, Senckenberg Research Institute and Natural History Museum Frankfurt, Frankfurt am Main, Germany

MARÍA LUISA GUTIÉRREZ • Institut Botànic de Barcelona, IBB (CSIC-Ajuntament de Barcelona), Barcelona, Catalonia, Spain

ROMAIN GUYOT • Department of Electronics and Automation, Universidad Autónoma de Manizales, Manizales, Caldas, Colombia; Institut de Recherche pour le Développement, CIRAD, University of Montpellier, Montpellier, France

ASMAA H. HASSAN • African Genome Center, Mohammed VI Polytechnic University, Ben Guerir, Morocco

MARIE C. HENNIGES • Royal Botanic Gardens, Kew, Richmond, UK; School of Biological and Behavioural Sciences, Queen Mary University of London, London, UK

ORIANE HIDALGO • Royal Botanic Gardens, Kew, Richmond, UK; Institut Botànic de Barcelona (IBB, CSIC-Ajuntament de Barcelona), Barcelona, Spain

GUSTAVO ISAZA • Department of Systems and Informatics, Universidad de Caldas, Manizales, Caldas, Colombia

PEDRO JARA-SEGUEL • Departamento de Ciencias Biológicas y Químicas & Núcleo de Estudios Ambientales, Facultad de Recursos Naturales, Universidad Católica de Temuco, Temuco, Chile

EMMELINE JOHNSTON • Royal Botanic Gardens, Kew, Richmond, UK

GÖKHAN KARAKÜLAH • Izmir International Biomedicine and Genome Institute, Dokuz Eylül University, İnciraltı, İzmir, Turkey; Bioinformatics Platform, İzmir Biomedicine and Genome Center (IBG), İnciraltı, İzmir, Turkey

SEBASTIAN KLEMM • Department of Botany, Senckenberg Museum of Natural History Görlitz, Görlitz, Germany

ALEŠ KOVAŘÍK • Institute of Biophysics, Academy of Sciences of the Czech Republic, Brno, Czech Republic

MATTHIAS LANGE • Leibniz Institute of Plant Genetics and Crop Plant Research (IPK), Seeland, OT Gatersleben, Germany

ANDREW R. LEITCH • School of Biological and Behavioural Sciences, Queen Mary University of London, London, UK

ILIA J. LEITCH • Royal Botanic Gardens, Kew, Richmond, UK

JORDI LÓPEZ-PUJOL • Institut Botànic de Barcelona (IBB, CSIC-Ajuntament de Barcelona), Barcelona, Catalonia, Spain

ITAY MAYROSE • School of Plant Sciences and Food Security, Tel Aviv University, Tel Aviv, Israel

MORAD M. MOKHTAR • African Genome Center, Mohammed VI Polytechnic University, Ben Guerir, Morocco

SIMON OROZCO-ARIAS • Department of Computer Science, Universidad Autónoma de Manizales, Manizales, Caldas, Colombia; Department of Systems and Informatics, Universidad de Caldas, Manizales, Caldas, Colombia

JURAJ PAULE • Department of Botany and Molecular Evolution, Senckenberg Research Institute and Natural History Museum Frankfurt, Frankfurt am Main, Germany; Botanischer Garten und Botanisches Museum Berlin, Freie Universität Berlin, Berlin, Germany

JAUME PELLICER • Royal Botanic Gardens, Kew, Richmond, UK; Institut Botànic de Barcelona (IBB, CSIC-Ajuntament de Barcelona), Barcelona, Spain

ANNA RICE • School of Plant Sciences and Food Security, Tel Aviv University, Tel Aviv, Israel

CHRISTIANE M. RITZ • Department of Botany, Senckenberg Museum of Natural History Görlitz, Görlitz, Germany

ROI RODRÍGUEZ-GONZÁLEZ • Institut Botànic de Barcelona, IBB (CSIC-Ajuntament de Barcelona), Barcelona, Catalonia, Spain

MARIELA A. SADER • Instituto Multidisciplinario de Biología Vegetal (IMBIV), Universidad de Córdoba – CONICET, Córdoba, Argentina

MARCO SCHMIDT • Palmengarten der Stadt Frankfurt am Main, Frankfurt am Main, Germany

UWE SCHOLZ • Leibniz Institute of Plant Genetics and Crop Plant Research (IPK), Seeland, OT Gatersleben, Germany

JOHN C. SEMPLE • Department of Biology, University of Waterloo, Waterloo, ON, Canada

JOAN SIMON • BioC (GReB, IRBio) – Laboratori de Botànica, Facultat de Farmàcia i Ciències de l’Alimentació, Universitat de Barcelona, Barcelona, Catalonia, Spain

JANA SOCHOROVÁ • Institute of Biophysics, Academy of Sciences of the Czech Republic, Brno, Czech Republic

GUSTAVO SOUZA • Laboratório de Citogenética Vegetal, Departamento de Botânica, Universidade Federal de Pernambuco, Recife, Brazil

REINEL TABARES-SOTO • Department of Electronics and Automation, Universidad Autónoma de Manizales, Manizales, Caldas, Colombia; Department of Systems and Informatics, Universidad de Caldas, Manizales, Caldas, Colombia

JUAN D. URDAMPILLETA • Instituto Multidisciplinario de Biología Vegetal (IMBIV), Universidad de Córdoba – CONICET, Córdoba, Argentina

JONATHAN URRUTIA-ESTRADA • Laboratorio de Invasiones Biológicas, Facultad de Ciencias Forestales, Universidad de Concepción, Concepción, Chile; Instituto de Ecología y Biodiversidad, IEB, Concepción, Chile

MAGDALENA VAIO • Laboratorio de Evolución y Domesticación de las Plantas, Departamento de Biología Vegetal, Facultad de Agronomía, Universidad de la República, Montevideo, Uruguay

KUNIAKI WATANABE • Department of Biology, Graduate School of Science, Kobe University, Kobe, Japan

KARSTEN WESCHE • Department of Botany, Senckenberg Museum of Natural History Görlitz, Görlitz, Germany; German Centre for Integrative Biodiversity Research Halle-Jena-Leipzig, Leipzig, Germany

JENS WESENBERG • Department of Botany, Senckenberg Museum of Natural History Görlitz, Görlitz, Germany

GEORG ZIZKA • Department of Botany and Molecular Evolution, Senckenberg Research Institute and Natural History Museum Frankfurt, Frankfurt am Main, Germany

Part I

Genomic Databases

Chapter 1

The Plant Phenomics and Genomics Research Data Repository: An On-Premise Approach for FAIR-Compliant Data Acquisition

Abstract

The FAIR data principles, as a commitment to support long-term research data management, are widely accepted in the scientific community. However, although many established infrastructures provide comprehensive and long-term stable services and platforms, a large quantity of research data is still hidden. Currently, high-throughput plant genomics and phenomics technologies are producing research data in abundance, the storage of which is not covered by established core databases. The obstacles include the data volume, for example, time series of images or high-resolution hyperspectral data; the quality of data formatting and annotation, e.g., with regard to the structure and annotation specifications of core databases; uncovered data domains; and organizational constraints prohibiting primary data storage outside institutional boundaries. To share these potentially dark data in a FAIR way and master these challenges, the ELIXIR Germany/de.NBI service Plant Genomics and Phenomics Research Data Repository (PGP) implements an on-premise approach, which allows research data to be kept in place and wrapped in FAIR-aware software infrastructure. In this chapter, the e!DAL infrastructure software and the PGP repository are presented as best practice for easily setting up FAIR-compliant and intuitive research data services.

Key words Research data management, FAIR principles, Digital object identifier, Plant genomics and phenomics

1 Introduction

Plant genomics and phenomics studies benefit from methodological and technological advances to address increasing demands on food security, biodiversity protection, and climate change [1]. However, besides the technological advances, a key aspect is access to a digital data-sphere that represents the whole of plant genetic resources as a diversity treasure. Therefore, it is essential to use established genomic and phenomic data dissemination standards and to publish the resulting research data on sustainable, open-access platforms.

Sonia Garcia and Neus Nualart (eds.), Plant Genomic and Cytogenetic Databases, Methods in Molecular Biology, vol. 2703, https://doi.org/10.1007/978-1-0716-3389-2_1, © The Author(s), under exclusive license to Springer Science+Business Media, LLC, part of Springer Nature 2023

One essential cornerstone is the consistent application of the FAIR principles along the entire research data life cycle [2–4]. A potential gap between data acquisition and data management is particularly obvious in high-throughput technologies, such as those used in plant genomics and phenomics, which continuously produce large amounts of data. A particularly critical breaking point is the storage of research data that is not covered by community-established repositories or core databases. Because of this, there is a high risk that many valuable datasets remain unpublished. This in turn causes a big gap between the research results created and the research datasets that are publicly accessible [5].

In this context, an alternative approach is to equip the data collection facility directly with the necessary infrastructure on site. This applies in particular to the possibility of publishing locally stored data, that is, on-premise, according to FAIR criteria. This chapter describes the design and concept of the Plant Phenomics & Genomics Research Data Repository, which follows this alternative approach and provides an on-premise architecture to overcome the challenges mentioned above and make comprehensive plant research data publicly available to the worldwide research community.

2 Background

2.1 Current Status of Data Publication Infrastructures

The Leibniz Institute of Plant Genetics and Crop Plant Research (IPK) Gatersleben holds one of the world’s largest ex situ germplasm collections and, therefore, has more than two decades of experience in managing plant-related research data. Based on this experience and on the demands of a sustainable data life cycle, the development of the e!DAL software infrastructure [6] was initiated in 2012. The concept was an on-premise data storage infrastructure as a service that exposes datasets with assigned digital object identifiers (DOIs), offered as a base service of the IPK in its role as a data center in the international DataCite consortium [7]. While conceptualizing the first draft of the underlying API specification for the infrastructure and implementing the initial prototype, it became clear that there was potential for a more comprehensive software environment. In the meantime, e!DAL has grown into a generic, community-accepted on-premise data archiving and publication infrastructure. It is part of infrastructure programs like ELIXIR [8, 9] and NFDI [10] and of institutional infrastructures [11].

Almost all scientists in public research are obligated and willing to share all the data they produce with the research community. This has been and is the practice in almost all cases [12]. The FAIR principles, which stand for Findability, Accessibility, Interoperability, and Reusability, were initially drafted by the FORCE11 workgroup in 2015 [13]. In 2016 they were finalized and published by Wilkinson et al. [14]. Today they are widely accepted. The research community shows an increasing awareness of the value of FAIR-compliant data, and the principles are increasingly adopted in several data management guidelines and funding policies [15–17].

Nevertheless, the concrete technical implementation is not a matter of course, and many data repositories invest a lot to improve their infrastructures for seamless integration into the data life cycle and to acquire and share the deposited data in a FAIR manner. One can categorize them into three classes. First, general-purpose data repositories (I) like figshare [18], Dryad [19], or Zenodo [20] are not focused on any specific data type. They are widely accepted and are usually the first point of contact when sharing cross-domain data that is not in the scope of established repositories. Next, core data deposition databases (II) are widely accepted in the scientific community; they have usually evolved over many years and have sustainable funding in the background, like the ELIXIR deposition databases [21], the comprehensive resources of the European Bioinformatics Institute (EBI) [22], or the different NCBI databases [23]. Last but not least, there is a large group of domain-specific repositories and databases developed and hosted by certain research institutions (III), such as PlabiPD, GBIS, and PHIS [24].

The decisive common feature of all mentioned systems is the method of data ingestion. Each data file is required to be uploaded by the owner from the local storage location to the respective system. Depending on the data granularity, such as the number of files and subdirectories, the data volume, or the supported network protocol, this can be very time-consuming [25] and does not even consider various manual processes such as metadata preparation, data formatting, and of course the actual upload process.
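The dependence of upload time on data volume, file count, and network throughput can be illustrated with a rough calculation. The following Python sketch is not part of e!DAL or PGP; the function name, the fixed per-file overhead, and all numbers in the usage lines are hypothetical assumptions chosen only to show why datasets made of many small files can take much longer to upload than their total volume suggests.

```python
def estimate_upload_seconds(total_bytes, n_files, bandwidth_bytes_per_s,
                            per_file_overhead_s=0.0):
    """Rough upload-time estimate (illustrative model, not a measurement).

    Assumes the transfer time is the raw volume divided by a constant
    bandwidth, plus a fixed overhead per file (connection setup, checksums,
    metadata calls). Real protocols behave in a more complex way.
    """
    if bandwidth_bytes_per_s <= 0:
        raise ValueError("bandwidth must be positive")
    if total_bytes < 0 or n_files < 0 or per_file_overhead_s < 0:
        raise ValueError("volume, file count, and overhead must be non-negative")
    transfer_s = total_bytes / bandwidth_bytes_per_s
    overhead_s = n_files * per_file_overhead_s
    return transfer_s + overhead_s


# Hypothetical example: a 2 TB image time series split into 500,000 files,
# sent over a 100 Mbit/s link (12.5e6 bytes/s) with 0.2 s overhead per file.
seconds = estimate_upload_seconds(2e12, 500_000, 12.5e6, per_file_overhead_s=0.2)
hours = seconds / 3600  # about 72 hours under these assumed values
```

Under these assumed values, the per-file overhead alone accounts for more than a third of the total time, which is one way to read the remark about data granularity above.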

Repositories of category (II) provide complex submission processes and specific metadata schemata, which require bioinformaticians or data stewards who are trained to operate the specific submission pipelines and to support researchers. Institutional infrastructures of category (III), on the other hand, require data stewards as well as additional skilled IT staff and computer scientists to set up and customize existing open-source solutions or to develop a home-brewed infrastructure, plus a budget for maintaining the underlying storage and network infrastructure. In some cases, the institutional repositories are too specific to attract substantial data volume because of their mainly project-driven development. Such repositories therefore often have a limited lifetime due to the high costs of sustainable solutions, especially for permanent staff. The infrastructures of category (I) seem to be an ideal alternative because they are usually very comfortable to use and provide a comprehensive feature set. Nevertheless, they do not implement the rich and harmonized metadata and data needed for integrated, linked, or big data applications, for example, to supply comprehensive training sets for AI-based models or machine learning algorithms. Furthermore, only a limited amount of free data storage is usually offered. This makes it necessary to request additional budget to publish large and comprehensive datasets, which is difficult to justify to funding agencies, especially when the requesting institution has sufficient in-house storage capacity but lacks a suitable infrastructure to use it for FAIR data publication.

Consequently, researchers tend to share only condensed and aggregated datasets to support their research articles and prove their results, which limits re-analysis based on primary data. Even if they are willing to share all their data in general, only a few actually provide their data as a whole [12].

2.2 Infrastructure-to-the-Data Approach (I2D)

As described in the previous section, the existing classes of data publication infrastructures have several drawbacks, which made a complementary infrastructure concept necessary for sharing comprehensive research data. To address the need for a multipurpose but domain-specifically configurable solution, the need to share a high volume and number of primary data files, and the limitations caused by network bandwidth and data privacy, the idea was to apply an on-premises approach following the infrastructure-to-the-data (I2D) concept [26].

Common data publication pipelines using one or more of the previously mentioned databases or platforms usually involve a time-consuming data transfer, in which the data is uploaded via the provided web applications or file clients. Another shortcoming of many services is the limited amount of free-of-charge storage. The costs for additional storage come on top of the storage hardware that institutions have already been granted. Suitable software infrastructures for sharing the stored data in a FAIR-compliant way, however, enable the reuse of this investment and save funding for external data hosting. The e!DAL software therefore follows the I2D concept by using a data publication layer to encapsulate the existing institutional infrastructure and benefit from the storage capacity available in-house. This layer acts as a broker to the data publication service of DataCite [7] and contains an infrastructure for submitting research data, reviewing files and metadata, and delivering DOIs [27]. Figure 1 shows schematically the basic idea and the differences between the data-publication-as-a-service approach and the data-publication-on-premises approach of the I2D concept.

3 The Electronic Data Archive Library (e!DAL)

The electronic Data Archive Library, or e!DAL for short, is a comprehensive but lightweight infrastructure software for managing and publishing research data. It combines common file-system concepts and research data publication features. On the one hand, it is delivered as an extensible and customizable software framework, which can be integrated and used in existing infrastructures and frontends; on the other hand, it delivers a stand-alone server-client architecture to set up a data repository infrastructure. The Plant Phenomics and Genomics Research Data Repository (e!DAL-PGP), which is the reference implementation and the first productive instance based on the e!DAL infrastructure, will be described in this section. Subsequently, the software design and the most important functions will be explained.

Fig. 1 Comparison of the data-publication-as-a-service concept and the data-publication-on-premises concept. Both approaches feature FAIR data publication but differ in costs. The data-publication-as-a-service approach, which is shown on the left side, costs a fee, delegates data property control, and faces capacity limits in storage and data upload. On the right side, the data-publication-on-premises approach is presented, which keeps the data in-house but requires the availability of a central server and storage hardware as well as the installation of the e!DAL software

3.1 Software Design and Implementation Concept

e!DAL is implemented in Java to enable platform-agnostic use and development in the JVM ecosystem. The wide distribution of the JVM ecosystem and its large developer community make it possible to implement comprehensive functions using established libraries. This reduces the implementation effort and helps to provide stable and continuous maintenance in the future. The object-oriented API design and the aspect-oriented implementation of common patterns make e!DAL scalable. A modularization concept encapsulates functions of common concern, like IO, network, metadata, storage, and search, which are necessary to meet the challenges and guarantee a sustainable infrastructure. The following list briefly summarizes the functions of the e!DAL software.

• Version Management: For reproducibility, the history of all stored data files and their corresponding metadata can be tracked. Furthermore, a strict deletion policy forbids removing already stored datasets and only allows a version chain to be marked as closed.

• Metadata Management: To guarantee persistent data access and readability, every data entity is annotated with a minimal set of technical metadata, which is crucial for searching and filtering data and supports automatic processing without the need for semantic interpretation.

• Information Retrieval: Because a large number of data files is expected, and to provide efficient data findability, an index-based search engine was integrated, providing enhanced functions like faceted search or fuzzy queries across the provided metadata. A standardized data harvesting interface is provided to connect and interlink the stored data.

• Data Protection: To protect the archived data against misuse, a fine-grained but intuitive security system is integrated, which can protect specific data entities and define rules for single users and groups. Additionally, single-sign-on-based authentication was integrated to lower the barrier for the user.

• Data Publication: The major function of the infrastructure is the sharing of research data. Therefore, a comprehensive data publication module, including an embedded, peer-review-based data submission and publication procedure, was integrated.

• Integrability: The overall focus of the development was on the generality of the infrastructure so that it is universally usable. This refers to a self-configuration of the software, easy setup and maintenance, and a user-friendly and generic design that can be easily integrated into existing workflows and tools.

The following sections describe these functional components in detail.

3.1.1 Version Management

For reasons of intuitiveness, the e!DAL data structure is adapted from common file-system infrastructures and consists of a hierarchy of files and folders, which is complex enough for organizing comprehensive datasets but understandable by every user and compatible with storage backends like file systems, object stores, or even distributed storage. Whichever backend stores the data objects, any additional technical metadata is managed independently of the concrete storage backend. This makes it possible to maintain a broad range of metadata and to version it.

An embedded version-control system makes changes traceable. Every version is associated with a bundle of a data entity and a set of metadata. In order to comply with the data life cycle and to audit the complete file and metadata history, the e!DAL data structure records every change: every single update of a data entity or its metadata results in a new version. The version-control mechanism also prevents the erasure of stored or published datasets; rather, it creates a final version and closes the version chain by tagging it with a “deleted” flag. This keeps all previous versions available and at the same time prevents the creation of new versions. The provided version control is embedded in the infrastructure design and is automated, which makes it very intuitive for the user and guarantees complete tracking and auditing of ingested research data. To achieve this, the default implementation of e!DAL extends the classical file-system capability with the flexibility of a relational database to manage fine-grained metadata. Using object-relational mapping techniques, e!DAL is agnostic to the concrete database system. Thus, e!DAL can be customized to use an existing database infrastructure on site or to use the embedded H2 database by default as the metadata persistence layer.

3.1.2 Metadata Management

Besides the specific version handling, another difference between the e!DAL infrastructure and common file systems is the integrated metadata handling. While classical file systems directly track only a limited amount of metadata for every data object, like the file owner or the file creation date, other metadata can only be stored in separate text files or condensed into smart “self-explaining” file names, which is neither comfortable nor reliable. Cloud-based object stores such as Amazon S3 [28] overcome these shortcomings, but in practice they are not yet rolled out in institutional storage infrastructures. The e!DAL infrastructure solves this issue by connecting every single file or folder version with its own set of standardized metadata. The schema used is inspired by the Dublin Core and DataCite metadata schemata and contains a set of technical attributes, which are, on the one hand, necessary to describe the actual dataset and guarantee its long-term readability and, on the other hand, mandatory for assigning a DOI when publishing the dataset. Table 1 gives an overview of the metadata attributes that are provided within the e!DAL data structure.
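The interplay of versioning and per-version metadata described above can be illustrated with a small, language-neutral sketch. It is written in Python for brevity and is not the e!DAL Java API: the class names `Version` and `VersionedEntity`, the method names, and the choice of SHA-256 for the checksum are all assumptions made for illustration. The sketch only shows the three rules stated in the text: every data or metadata update appends a new version, each version carries its own metadata set, and deletion appends a final flagged version that closes the chain.

```python
import hashlib
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass(frozen=True)
class Version:
    """One immutable bundle of a payload and its metadata set."""
    number: int
    payload: bytes
    metadata: dict
    created: datetime
    deleted: bool = False


@dataclass
class VersionedEntity:
    """Hypothetical file-like entity with an append-only version chain."""
    name: str
    versions: list = field(default_factory=list)

    @property
    def closed(self) -> bool:
        # The chain is closed once its last version carries the deleted flag.
        return bool(self.versions) and self.versions[-1].deleted

    def _append(self, payload, metadata, deleted=False):
        if self.closed:
            raise PermissionError(f"version chain of {self.name!r} is closed")
        # Technical attributes are derived from the payload, not typed by hand.
        technical = {
            "Size": len(payload),
            "Checksum": "sha256:" + hashlib.sha256(payload).hexdigest(),
        }
        version = Version(
            number=len(self.versions) + 1,
            payload=payload,
            metadata={**metadata, **technical},
            created=datetime.now(timezone.utc),
            deleted=deleted,
        )
        self.versions.append(version)
        return version

    def store(self, payload: bytes, **metadata):
        """New data creates a new version that inherits earlier metadata."""
        inherited = dict(self.versions[-1].metadata) if self.versions else {}
        return self._append(payload, {**inherited, **metadata})

    def annotate(self, **metadata):
        """A metadata-only update also creates a new version."""
        latest = self.versions[-1]
        return self._append(latest.payload, {**latest.metadata, **metadata})

    def delete(self):
        """Nothing is erased: a final flagged version closes the chain."""
        latest = self.versions[-1]
        return self._append(latest.payload, latest.metadata, deleted=True)


entity = VersionedEntity("phenotypes.csv")
entity.store(b"plant,height\nA,12\n", Title="Phenotypes", Format="text/csv")
entity.annotate(Rights="CC BY 4.0")          # metadata change -> version 2
entity.store(b"plant,height\nA,12\nB,15\n")  # data change -> version 3
entity.delete()                              # final version, chain closed
```

After the last call, all four versions remain readable through `entity.versions`, while any further `store` or `annotate` call raises an error. A relational persistence layer, as used by the default implementation, would store the same information in tables rather than in memory.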

Table 1

Attribute       Description
Title           A given name to the data resource
Creators        Responsible people or organizations for creating the data
Contributors    People or organizations that contributed to the data creation
Description     A brief abstract of the data content
Subjects        List of specific keywords describing the data
Publisher       A responsible organization for making the data available
Language        The language of the data resource
Dates           Linked timepoint or period of an event in the data lifecycle
Rights          Information about any rights held in and over the data
Format          The MIME type of the data
Size            Information about the size of the data file or data set
Checksum        One or more generated checksums for the data

In general, there are some recommendations but no fixed definitions of the concrete data types that should be used for the different fields. To lower the barrier for the data provider, e!DAL provides several data types and a mapping that defines which data type fits which attribute. This makes the description easier and provides a standardized schema and standardized values.

3.1.3 Information Retrieval

In contrast to common file systems, which usually provide only very elementary search capabilities, e!DAL provides not only data navigation along the hierarchically organized folder structure but also a bundle of index-based search functions. It automatically creates a full-text index over all metadata attributes and selected content data when data entities are created, which is crucial for making data findable among the large number of expected files. Building on this index, the search offers a comprehensive set of functions and settings, such as faceted search or fuzzy queries across the provided metadata.
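To make the terms concrete, the following sketch shows what an index-based search with fuzzy queries and facets does in principle. It is a toy inverted index in plain Python and does not reflect how the search engine inside e!DAL is built or configured; the class `MetadataIndex`, its methods, the sample records, and the similarity cutoff of 0.8 are assumptions for illustration. Fuzzy matching is approximated here with the standard-library `difflib` module.

```python
import re
from collections import Counter, defaultdict
from difflib import get_close_matches


class MetadataIndex:
    """Toy full-text index over metadata values (illustrative only)."""

    def __init__(self):
        self.records = {}                 # record id -> metadata dict
        self.postings = defaultdict(set)  # term -> ids of records containing it

    @staticmethod
    def _terms(text):
        return re.findall(r"[a-z0-9]+", text.lower())

    @staticmethod
    def _as_list(value):
        return list(value) if isinstance(value, (list, tuple)) else [value]

    def add(self, record_id, metadata):
        """Index every term of every metadata value when a record is created."""
        self.records[record_id] = metadata
        for value in metadata.values():
            for item in self._as_list(value):
                for term in self._terms(str(item)):
                    self.postings[term].add(record_id)

    def search(self, query, fuzzy=False):
        """Return the ids of records that match every query term."""
        result = None
        for term in self._terms(query):
            candidates = [term]
            if fuzzy:
                # Tolerate typos by also accepting similar indexed terms.
                similar = get_close_matches(term, list(self.postings), n=5, cutoff=0.8)
                candidates = similar or [term]
            hits = set()
            for candidate in candidates:
                hits |= self.postings.get(candidate, set())
            result = hits if result is None else result & hits
        return result or set()

    def facets(self, ids, attribute):
        """Count the values of one attribute within a result set."""
        counts = Counter()
        for record_id in ids:
            counts.update(self._as_list(self.records[record_id].get(attribute, [])))
        return counts


index = MetadataIndex()
index.add("d1", {"Title": "Barley phenotyping images",
                 "Subjects": ["barley", "phenotyping"], "Format": "image/png"})
index.add("d2", {"Title": "Barley genotyping data",
                 "Subjects": ["barley", "genotyping"], "Format": "text/csv"})
index.add("d3", {"Title": "Wheat root models",
                 "Subjects": ["wheat"], "Format": "image/png"})

exact = index.search("barley")              # records d1 and d2
typo = index.search("barly")                # no exact match
fuzzy = index.search("barly", fuzzy=True)   # typo tolerated, d1 and d2 again
by_format = index.facets(exact, "Format")   # one image/png, one text/csv
```

The facet counts are what a user interface would show next to a result list so that the hits can be narrowed down, for example to a single file format.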

In addition to the API functions, e!DAL provides an embedded user interface (UI) component that exposes the comprehensive search functions and makes them readily usable when an e!DAL-based repository is established. A screenshot of the search interface is shown in Fig. 2.

Fig. 2 Screenshot of the e!DAL embedded search features, illustrating an example of a specific search within the e!DAL-PGP repository

3.1.4 Data Privacy

Protecting research data is important for reusability and crucial for the acceptance of the infrastructure by the community in general and by users in particular. Users request protection against misuse; therefore, e!DAL has an integrated, fine-grained access control system that already secures sensitive methods at a low API level. Executing code is owned by a particular subject, which can be a trusted digital identity resulting from an authentication process. This could be, for example, an access principal bound to a user name, an email address, or a user group. It is assigned within a single-sign-on authentication process when a session with the e!DAL infrastructure is initiated. This supports local authentication services, such as operating system logins, as well as OAuth- or SAML-based authentication and authorization infrastructures (AAI), like federated AAI, Google, ORCID, or others. To lower the barrier, an authentication module based on the ELIXIR AAI [29] allows many researchers to log in through their existing institutional identity provider, so that no proprietary user management has to be implemented. This provides a trusted infrastructure, is comfortable for the user, and decreases the effort of hosting an e!DAL-based infrastructure.

Besides the capability to define access rules and protect specific methods on stored datasets, another important aspect of data security and reusability is a transparent and reliable definition of the permissions and licenses for data reuse. This legal and authorization information can be stored in the metadata set and can therefore be defined for every single version of a stored data entity if necessary. Due to the generic character of the infrastructure and the diversity of research data, no concrete license is prescribed. Nevertheless, the graphical user interface (GUI) of the embedded submission tool, which is described in the next section, integrates the Creative Commons (CC) license model, and one of the seven provided CC licenses can be selected, because they are suitable for a broad range of research data.

3.1.5 Data Publication

Generally, it is impossible to embed comprehensive research data directly into a research article, either because the data is far too big or because there is no suitable graphical representation to show it in an article. Thus, the authors add links to make the data accessible. Usually, these are encoded as proprietary IDs of core databases or as URLs. To make them stable and resolvable in the long term, it is necessary to support globally unique and persistent identifiers (PUIDs), which are widely accepted and fulfill international standards; this is crucial to ensure that research data is citable. Furthermore, it enables credit to be given to the data producer and curator. A central resolver service ensures the long-term resolvability of a PUID to a valid URL. A specific interface in e!DAL was designed to bind to PUID services. For the reference implementation, a connection to the DataCite REST API was implemented to mint DOIs. DOIs, and PUIDs in general, are immutable and represent scientific publications, which makes it necessary that the assignment is permitted and controlled by a responsible publisher. Therefore, a customizable, peer-review-inspired approval workflow was designed. The workflow is shown schematically in Fig. 3; it allows data publication requests to be evaluated by a hierarchy of reviewer decisions encoded as a decision matrix.

Fig. 3 Integrated data review and approval process. The flow chart on the left side explains the data submission and review procedure. The communication between the reviewers, the infrastructure, and the data producers is handled via email with embedded action links, shown on the right side. The decisions of the reviewers during the approval process are evaluated using a decision matrix, which is also shown on the right side

This process is embedded in the publication module of e!DAL and can be customized for any connected PUID service. The datasets and their corresponding metadata must pass an internal review process before being published in order to guarantee a sustainable data annotation and prevent legal violations. In the default configuration, two reviewer groups are distinguished. The scientific reviewers have scientific expertise in the author's research area and are asked to evaluate the scientific quality of the dataset. The administrative reviewers are responsible for checking publication rights and potential conflicts with respect to intellectual property and patent regulations of the authors and affiliated institutions. Each data submission triggers a notification to the reviewers, who then individually or collaboratively decide to permit or reject it. The reviewers' ratings are combined using a configurable decision matrix to reach a final decision. The default matrix, which is also shown in Fig. 3, lets a scientific reviewer or their assistant decide on rejection or acceptance. The administrative reviewer may confirm their decision or cast a veto that rejects the entire submission. This procedure increases the data quality and prevents the accidental publication of confidential information. The default configuration is intuitive and guarantees fast processing, but it can also be customized to reflect further institutional policies. After approval by the reviewers, a DOI is assigned as a permanent reference for data sharing. To make the process intuitive and as fast as possible for data providers and reviewers, the workflow is based on an asynchronous email notification system with reminders.
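A decision matrix of this kind can be pictured as a lookup table from the reviewers' votes to an outcome. The sketch below is one possible reading of the default behavior described above, written in Python for brevity; it is not taken from the e!DAL code base, and the matrix actually used is the one in Fig. 3. The names `Vote`, `DEFAULT_MATRIX`, and `decide` are hypothetical, and the handling of pending votes in particular is an assumption.

```python
from enum import Enum


class Vote(Enum):
    PENDING = "pending"
    ACCEPT = "accept"
    REJECT = "reject"


A, R, P = Vote.ACCEPT, Vote.REJECT, Vote.PENDING

# Assumed matrix: (scientific vote, administrative vote) -> final decision.
# An administrative REJECT acts as a veto; acceptance needs both groups.
DEFAULT_MATRIX = {
    (A, A): A, (A, R): R, (A, P): P,
    (R, A): R, (R, R): R, (R, P): P,
    (P, A): P, (P, R): R, (P, P): P,
}


def decide(scientific, assistant, administrative, matrix=DEFAULT_MATRIX):
    """Combine the reviewer votes into accept, reject, or still pending."""
    # The assistant's vote counts only while the scientific reviewer is silent.
    science = scientific if scientific is not Vote.PENDING else assistant
    return matrix[(science, administrative)]


# The assistant accepts on behalf of the scientific reviewer, and the
# administrative reviewer confirms: the submission is accepted.
outcome = decide(Vote.PENDING, Vote.ACCEPT, Vote.ACCEPT)

# The scientific reviewer accepts, but the administrative veto rejects.
vetoed = decide(Vote.ACCEPT, Vote.PENDING, Vote.REJECT)
```

Because the matrix is passed in as data, an institution with a different policy, for example one that rejects immediately on a scientific rejection, would only need to supply a different table.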

To make the data submission and the approval process usable, an intuitive GUI for data description and submission was developed and embedded.

The publication process is initiated by the data and metadata submission tool, as shown in Fig. 4. This comprises an upload of the dataset and mandatory technical metadata, like a title, keywords, and a suitable license. Some attributes, for example, file size, file type, or file format, are determined automatically. Furthermore, the tool supports the reuse of metadata entered in previous sessions. Optionally, the data submitter may define an embargo date, which hides the content page that provides access to the dataset from the public until a defined time point; this is useful when the dataset is subject to certain project restrictions.

Fig. 4 Graphical user interface of the desktop version of the data submission tool for the e!DAL-PGP repository

In order to mimic familiar submission processes, the design and handling of the graphical user interface reuse design patterns known from scientific journals or research conferences. After the data upload to the connected e!DAL-based repository is completed, the reviewers are notified of the publication request by an email that includes a preview URL with restricted access to the submitted files and metadata.

If the data submission is rejected, the requesting user is informed and asked to revise the dataset and metadata. After a successful review, the author is notified and can confirm the submission to finally assign the DOI, or alternatively discard the submission if any change is necessary. The released DOI is registered and sent out in an additional email. Alternatively, the author can postpone the DOI assignment and use the temporary preview link from the notification, for example, to provide access to the reviewers of a manuscript.

3.1.6 Integrability

The overall goal during the conceptualization of e!DAL and the development of the reference implementation was to provide a flexible and generic infrastructure. Concretely, this means that it should not only be easy to use but also configurable and maintainable with low effort. This is crucial for integrating the infrastructure into diverse existing workflows and solutions that support researchers in their daily work.

Therefore, e!DAL is implemented in Java, which is still one of the most popular programming languages and is widely used in the life science community. Furthermore, this makes it platform independent. e!DAL supports stand-alone and client-server architectures and can be integrated into Java applications as an embedded local archive or used as a central archiving system. Thus, several applications or project-specific central repositories can be operated easily. All previously described components are already fully embedded within the infrastructure and the reference implementation. To set up e!DAL, no additional technology like a database, an indexing engine, or a web server is needed. No specific knowledge is required because the setup of e!DAL is largely self-configuring. To initiate a local or a central instance, the user needs to provide several general connection parameters, which are required to connect the system with different online services like DataCite for assigning DOIs, and some organizational parameters, which are needed to use the full functionality, like the reviewers' email addresses for the embedded approval process. The comprehensive set of diverse login modules, which is already integrated, facilitates the usage of the infrastructure.

3.2 Software Quality and Sustainable Documentation

Besides the generic character of the e!DAL infrastructure and its reference implementation as the e!DAL-PGP repository at IPK Gatersleben, another important aspect during the development was to guarantee high software quality and usability, since shortcomings in these areas cause diverse challenges in reusing existing software [30]. The quality of research software is therefore important and receives increasing attention, which has also resulted in several approaches and recommendations for maintainable, reusable, and high-quality research software [31].

To guarantee the quality and sustainability of the e!DAL software stack, we followed lessons learned in the past decade of active software development in research projects, published studies [32], and guidelines for software development [33]. One focus was platform independence, which was addressed by implementing e!DAL in Java. Furthermore, we consistently used open-source libraries, which extend the native Java API, to avoid home-brewed code. There is a huge community that maintains comprehensive libraries. The reference implementation of e!DAL therefore uses several established libraries and frameworks, like Hibernate [34] for object persistence, AspectJ [35] for performant code weaving, and Lucene [36] for the search engine. This supports the use of current, best-performing code and avoids wasting time on developing error-prone code for core components such as security, database connection, search functionality, or user management. Furthermore, JVM-based code is widely used in the bioinformatics community to develop graphical user interfaces, algorithms, data management infrastructures, and web apps.

In parallel to the development of the API and the core code, a comprehensive test suite based on JUnit was created, which guarantees a continuous evaluation of quality and performance during development. The comprehensive functionality results in a large amount of code and embedded libraries, which requires application build and dependency management tools to maintain the infrastructure in the long term and to release stable versions regularly. To meet these challenges, the development and release process is managed with the Gradle build tool. It allows fast and specific build cycles by supporting multi-core systems, which makes it possible to run a large set of unit tests in parallel. The different components of the e!DAL infrastructure are organized in a multi-build project, which contains the main API components, including the reference implementation, and the components for the server-client architecture, which is directly based on this core implementation. The whole project is stored in a Git repository on Bitbucket [37], and every release is published on Maven Central as an artifact [38], which allows the library to be embedded easily using build management tools like Maven or Gradle.

The next important aspect for improving the acceptance of the e!DAL infrastructure is comprehensive software documentation for experienced developers as well as sample code and training material for casual programmers, who usually have basic programming knowledge. Comprehensive source code documentation is collected as JavaDoc and created automatically during every release process. It is published as a bundle with the above-mentioned artifact and is also available on the project website [39]. The website also provides different usage examples as code snippets, which help users get started with the software and show how to set up and configure their own repository instances. Furthermore, video tutorials, presentation recordings, and associated publications are available to lower the initial barriers to digging into the e!DAL infrastructure. The website is updated automatically during the release process.

4 Plant Phenomics and Genomics Research Data Repository

Based on the previously described e!DAL software infrastructure, the first productive repository following the I2D approach was released at the IPK Gatersleben in 2015 to address the challenging data publication needs within the German Plant Phenotyping Network (DPPN). The now institutionally operated “Plant Phenomics and Genomics Research Data Repository” (e!DAL-PGP) exploits the full potential of the IPK's capabilities as a DataCite data center for assigning DOIs. Initially, it was limited to IPK users as an intuitive solution for publishing their diverse and large plant-related research datasets. After several successful data publications and some slight improvements, submission was promptly extended to external users by integrating the ELIXIR AAI [29].

The e!DAL-PGP repository covers in particular cross-domain datasets that cannot be published in other public repositories for reasons of data volume or data domain, such as phenotyping images, genotyping data, visualizations of morphological models, and data from mass spectrometry, as well as software and related documents. As of mid-2022, e!DAL-PGP provides 250 data records that can be referenced via DOIs and are annotated with technical metadata. These datasets comprise over 1.5 million files with an overall volume of >4.8 TB, as shown in Fig. 5.

Fig. 5 Overview of the data access statistics and development of the data stock provided by the e!DAL-PGP repository. The left diagram shows the increasing number of stored datasets separated by number of files and overall volume. The right diagram shows the overall volume of downloaded data and the number of different user accesses to the published datasets

To ensure data findability and accessibility, e!DAL-PGP provides a content page (see Fig. 6) for every dataset. Each page contains JSON-LD formatted metadata and is therefore harvestable by web crawler services like those of Google or Microsoft, which follow the Schema.org recommendations [40] for structured data.
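As an illustration of what such embedded metadata can look like, the following sketch maps a few of the attributes from Table 1 to a Schema.org `Dataset` description and wraps it in the script element that crawlers read. It is a minimal Python sketch under stated assumptions and does not reproduce the markup that e!DAL-PGP actually emits: the attribute-to-property mapping, the function names, the example record, the DOI `10.1234/example`, and the landing page URL are all hypothetical.

```python
import json


def dataset_jsonld(metadata, doi, landing_page):
    """Map Table 1 style attributes to a schema.org Dataset description."""
    doi_url = f"https://doi.org/{doi}"
    return {
        "@context": "https://schema.org",
        "@type": "Dataset",
        "@id": doi_url,
        "identifier": doi_url,
        "url": landing_page,
        "name": metadata["Title"],
        "description": metadata["Description"],
        # Assumption: every creator is a person; an organization would be
        # described with "@type": "Organization" instead.
        "creator": [{"@type": "Person", "name": name}
                    for name in metadata["Creators"]],
        "keywords": metadata["Subjects"],
        "publisher": {"@type": "Organization", "name": metadata["Publisher"]},
        "inLanguage": metadata["Language"],
        "license": metadata["Rights"],
        "encodingFormat": metadata["Format"],
    }


def embed(jsonld):
    """Wrap the description in the script element used for JSON-LD in HTML."""
    body = json.dumps(jsonld, indent=2)
    return f'<script type="application/ld+json">\n{body}\n</script>'


# Hypothetical record; none of these values refer to a real dataset.
example = {
    "Title": "Example phenotyping images",
    "Description": "Time series of top-view images of barley plants.",
    "Creators": ["Jane Doe"],
    "Subjects": ["barley", "phenotyping"],
    "Publisher": "Example Institute",
    "Language": "en",
    "Rights": "https://creativecommons.org/licenses/by/4.0/",
    "Format": "image/png",
}

record = dataset_jsonld(example, doi="10.1234/example",
                        landing_page="https://repository.example.org/dataset/1")
snippet = embed(record)
```

The resulting `snippet` would be placed in the HTML of the content page, where a crawler can parse it without having to interpret the visible page layout.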

To support scientists in disseminating their research data, e!DAL-PGP is accepted as a recommended institutional repository by journals such as Scientific Data (Nature Publishing Group), GigaScience (Oxford Academic), and others. Furthermore, it is registered in re3data.org [41], FAIRsharing.org [42], OpenAIRE [43], and DataCite [7]. The benefits of this wide support of technologies that enable data discovery, and of data publication in general, are demonstrated by the constantly increasing number of data accesses. By mid-2022, e!DAL-PGP had delivered 900 TB of data, and the provided datasets had been accessed by 180,000 unique clients.

5 Summary

This chapter presented the design principles of an infrastructure-to-the-data (I2D) approach to implement “on-premises” data management and data publication. A proof of principle was given by the presented e!DAL infrastructure. It is a comprehensive infrastructure that is compliant with the FAIR data principles. The development was driven by the aim to provide a flexible setup and easy integration into existing infrastructures and the daily research process. The described I2D concept differs from generic publication platforms like figshare, Dryad, or Zenodo. Their business model is based on considerable client investments depending on the needed storage and additional data transfer time. In case of an


The Project Gutenberg eBook of Orders conceived and published by the Lord Major and Aldermen of the City of London, concerning the infection of the plague

This ebook is for the use of anyone anywhere in the United States and most other parts of the world at no cost and with almost no restrictions whatsoever. You may copy it, give it away or re-use it under the terms of the Project Gutenberg License included with this ebook or online at www.gutenberg.org. If you are not located in the United States, you will have to check the laws of the country where you are located before using this eBook.

Title: Orders conceived and published by the Lord Major and Aldermen of the City of London, concerning the infection of the plague

Creator: City of London . Court of Aldermen City of London . Lord Mayor

Release date: February 12, 2024 [eBook #72934]

Language: English

Original publication: London: James Flesher, 1665

Credits: Daniel Lowe and the Online Distributed Proofreading Team at https://www.pgdp.net (This file was produced from images generously made available by The Internet Archive)

*** START OF THE PROJECT GUTENBERG EBOOK ORDERS CONCEIVED AND PUBLISHED BY THE LORD MAJOR AND ALDERMEN OF THE CITY OF LONDON, CONCERNING THE INFECTION OF THE PLAGUE ***

ORDERS

CONCEIVED AND PUBLISHED

By

The Lord MAJOR and Aldermen of the City of London, concerning the Infection of the Plague.

Printed by James Flesher, Printer to the Honourable City of London.

ORDERS

Conceived and published by the Lord Major and Aldermen of the City of London, concerning the infection of the Plague.

Whereas in the first Year of the Reign of our late Sovereign King James of happy memory, an Act was made for the charitable relief and ordering of Persons infected with the Plague: whereby Authority was given to Justices of Peace, Majors, Bayliffs, and other Head-Officers to appoint within their several Limits Examiners, Searchers, Watchmen, Keepers, and Buriers for the Persons and Places infected, and to minister unto them Oaths for the performance of their Offices. And the same Statute did also authorize the giving of other Directions, as unto them for the present necessity should seem good in their discretions. It is now upon special consideration thought very expedient for preventing and avoiding of infection of Sickness (if it shall so please Almighty God) that these Officers following be appointed, and these Orders hereafter duly observed.

Examiners to be appointed in every Parish.

First, it is thought requisite and so ordered, that in every Parish there be one, two, or more persons of good sort and credit, chosen and appointed by the Alderman, his Deputy, and Common-Councel of every Ward, by the name of Examiners, to continue in that Office the space of two Moneths at least: And if any fit Person so appointed, shall refuse to undertake the same, the said parties so refusing, to be committed to Prison until they shall conform themselves accordingly.

The Examiners Office.

That these Examiners be sworn by the Alderman, to enquire and learn from time to time what Houses in every Parish be visited, and what persons be sick, and of what Diseases, as near as they can inform themselves; and upon doubt in that case, to command restraint of access, until it appear what the Disease shall prove: And if they finde any person sick of the Infection, to give order to the Constable that the House be shut up; and if the Constable shall be found remiss or negligent, to give present notice thereof to the Alderman of the Ward.

Watchmen.

That to every Infected House there be appointed two Watchmen, one for the Day, and the other for the Night: And that these Watchmen have a special care that no person goe in or out of such infected Houses, whereof they have the Charge, upon pain of severe punishment. And the said Watchmen to doe such further Offices as the sick House shall need and require: And if the Watchman be sent upon any business, to lock up the House and take the Key with him: and the Watchman by day to attend until ten of the clock at night: and the Watchman by night until six in the morning.

Searchers.

That there be a special care, to appoint Women-Searchers in every Parish, such as are of honest reputation, and of the best sort as can be got in this kind: And these to be sworn to make due search and true report, to the utmost of their knowledge, whether the Persons, whose bodies they are appointed to Search, do die of the Infection, or of what other Diseases, as near as they can. And that the Physicians who shall be appointed for cure and prevention of the Infection, do call before them the said Searchers who are or shall be appointed for the several Parishes under their respective Cares, to the end they may consider whether they are fitly qualified for that employment; and charge them from time to time as they shall see cause, if they appear defective in their duties.

That no Searcher during this time of Visitation, be permitted to use any publick work or imployment, or keep any Shop or Stall, or be imployed as a Landress, or in any other common imployment whatsoever.

Chirurgions.

For better assistance of the Searchers, for as much as there hath been heretofore great abuse in misreporting the Disease, to the further spreading of the Infection: It is therefore ordered, that there be chosen and appointed able and discreet Chirurgions, besides those that doe already belong to the Pest-house: amongst whom, the City and Liberties to be quartered as the places lie most apt and convenient: and every of these to have one quarter for his Limit: and the said Chirurgions in every of their Limits to joyn with the Searchers for the view of the body, to the end there may be a true report made of the Disease.

And further, that the said Chirurgions shall visit and search such like persons as shall either send for them, or be named and directed unto them, by the examiners of every Parish, and inform themselves of the Disease of the said parties.

And for as much as the said Chirurgions are to be sequestred from all other Cures, and kept onely to this Disease of the Infection; It is ordered, that every of the said Chirurgions shall have twelve-pence a Body searched by them, to be paid out of the goods of the party searched, if he be able, or otherwise by the Parish.

Nurse-keepers.

If any Nurse-keeper shall remove herself out of any infected House before 28 daies after the decease of any person dying of the Infection, the House to which the said Nurse-keeper doth so remove herself shall be shut up until the said 28 daies be expired.

Orders concerning infected Houses, and Persons sick of the Plague.

Notice to be given of the Sickness.

The Master of every House, as soon as any one in his House complaineth, either of Botch or Purple, or Swelling in any part of his body, or falleth otherwise dangerously sick, without apparent cause of some other Disease, shall give knowledge thereof to the Examiner of Health within two hours after the said sign shall appear.

Sequestration of the Sick.

As soon as any man shall be found by this Examiner, Chirurgion or Searcher to be sick of the Plague, he shall the same night be sequestred in the same house. And in case he be so sequestred, then though he afterwards die not, the House wherein he sickned shall be shut up for a Moneth, after the use of due Preservatives taken by the rest.

Airing the Stuff.

For sequestration of the goods and stuff of the infected, their Bedding, and Apparel, and Hangings of Chambers, must be well aired with fire, and such perfumes as are requisite within the infected House, before they be taken again to use: this to be done by the appointment of the Examiner.

Shutting up of the House.

If any person shall have visited any man, known to be Infected of the Plague, or entred willingly into any known Infected House, being not allowed: the House wherein he inhabiteth, shall be shut up for certain daies by the Examiners direction.

None to be removed out of Infected Houses, but, &c.

Item, that none be removed out of the House where he falleth sick of the Infection, into any other House in the City, (except it be to the Pest-house or a Tent, or unto some such House, which the owner of the said visited House holdeth in his own hands, and occupieth by his servants) and so as security be given to the Parish whither such remove is made, that the attendance and charge about the said visited persons shall be observed and charged in all the particularities before expressed, without any cost of that Parish, to which any such remove shall happen to be made, and this remove to be done by night: And it shall be lawful to any person that hath two Houses, to remove either his sound or his infected people to his spare House at his choice, so as if he send away first his sound, he may not after send thither the sick, nor again unto the sick the sound. And that the same which he sendeth, be for one week at the least shut up and secluded from company for fear of some infection, at the first not appearing.

Burial of the dead.

That the Burial of the dead by this Visitation be at most convenient hours, alwaies either before Sun-rising, or after Sun-setting, with the privity of the Churchwardens or Constables, and not otherwise; and that no Neighbours nor Friends be suffered to accompany the Coarse to Church, or to enter the house visited, upon pain of having his house shut up, or be imprisoned.

And that no Corps dying of Infection shall be buried or remain in any Church in time of Common-Prayer, Sermon, or Lecture. And that no children be suffered at time of burial of any Corps in any Church, Church-yard, or Burying-place to come near the Corps, Coffin, or Grave. And that all the Graves shall be at least six foot deep.

And further, all publick Assemblies at other Burials are to be forborn during the continuance of this Visitation.

No infected Stuff to be uttered.

That no Clothes, Stuff, Bedding or Garments be suffered to be carried or conveyed out of any infected Houses, and that the Criers and Carriers abroad of Bedding or old Apparel to be sold or pawned, be utterly prohibited and restrained, and no Brokers of Bedding or old Apparel be permitted to make any outward Shew, or hang forth on their Stalls, Shopboards or Windows toward any Street, Lane, Common-way or Passage, any old Bedding or Apparel to be sold, upon pain of Imprisonment. And if any Broker or other person shall buy any Bedding, Apparel, or other Stuff out of any Infected house, within two Moneths after the Infection hath been there, his house shall be shut up as Infected, and so shall continue shut up twenty daies at the least.

No person to be conveyed out of any infected House.

If any person visited do fortune, by negligent looking unto, or by any other means, to come, or be conveyed from a place infected, to any other place, the Parish from whence such Party hath come or been conveyed, upon notice thereof given, shall at their charge cause the said party so visited and escaped, to be carried and brought back again by night, and the parties in this case offending, to be punished at the direction of the Alderman of the Ward, and the house of the receiver of such visited person to be shut up for twenty daies.

Every visited house to be marked.

That every House visited, be marked with a Red Cross of a foot long, in the middle of the door, evident to be seen, and with these usual Printed words, that is to say, Lord have mercy upon us, to be set close over the same Cross, there to continue until lawful opening of the same House.

Every visited House to be watched.

That the Constables see every house shut up, and to be attended with Watchmen, which may keep them in, and minister necessaries unto them at their own charges (if they be able,) or at the common charge if they be unable: the shutting up to be for the space of four Weeks after all be whole.

That precise order be taken that the Searchers, Chirurgions, Keepers and Buriers, are not to pass the streets without holding a red Rod or Wand of three foot in length in their hands, open and evident to be seen, and are not to goe into any other house then into their own, or into that whereunto they are directed or sent for, but to forbear and abstain from company, especially when they have been lately used in any such business or attendance.

Inmates.

That where several Inmates are in one and the same house, and any person in that house happen to be infected; no other person or family of such house shall be suffered to remove him or themselves without a Certificate from the Examiners of Health of that Parish; or in default thereof, the house whither he or they so remove, shall be shut up as in case of Visitation.

Hackney Coaches.

That care be taken of Hackney Coachmen, that they may not (as some of them have been observed to doe) after carrying of infected persons to the Pesthouse, and other places, be admitted to common use, till their Coaches be well aired, and have stood unimployed by the space of five or six daies after such service.

Orders for cleansing and keeping of the Streets sweet.

The Streets to be kept clean.

First, it is thought very necessary, and so ordered, that every Householder do cause the street to be daily pared before his door, and so to keep it clean swept all the Week long.

That Rakers take it from out the Houses.

That the sweeping and filth of houses be daily carried away by the Rakers, and that the Raker shall give notice of his coming by the blowing of a Horn as heretofore hath been done.

Laystalls to be made farre off from the City.

That the Laystalls be removed as farre as may be out of the City, and common passages, and that no Nightman or other be suffered to empty a Vault into any Garden near about the City.

Care to be had of unwholesome Fish or Flesh, and of musty Corn.

That special care be taken, that no stinking Fish, or unwholsome Flesh, or musty Corn, or other corrupt fruits of what sort soever, be suffered to be sold about the City or any part of the same.

That the Brewers and Tipling-houses be looked unto, for musty and unwholsome Cask.

That no Hogs, Dogs, or Cats, or tame Pigeons, or Conies be suffered to be kept within any part of the City, or any Swine to be, or stay in the Streets or Lanes, but that such Swine be impounded by the Beadle or any other Officer, and the Owner punished according to Act of Common-Councel, and that the Dogs be killed by the Dogkillers appointed for that purpose.

Orders concerning loose Persons and idle Assemblies.

Beggers.

Forasmuch as nothing is more complained of, then the multitude of Rogues and wandering Beggers that swarm in every place about the City, being a great cause of the spreading of the Infection, and will not be avoided, notwithstanding any Order that hath been given to the contrary: It is therefore now ordered, that such Constables, and others whom this matter may any way concern, do take special care that no wandering Begger be suffered in the Streets of this City, in any fashion or manner whatsoever upon the penalty provided by the Law to be duly and severely executed upon them.

Playes.

That all Playes, Bear-baitings, Games, Singing of Ballads, Bucklerplay, or such like causes of Assemblies of people, be utterly prohibited, and the parties offending, severely punished by every Alderman in his Ward.

Feasting Prohibited.

That all publick Feasting, and particularly by the Companies of this City; and Dinners at Taverns, Alehouses, and other places of common entertainment be forborn till further order and allowance; and that the money thereby spared, be preserved and imployed for the benefit and relief of the poor visited with the infection.

Tipling-houses.

That disorderly Tipling in Taverns, Alehouses, Coffee-houses and Cellars be severely looked unto, as the common Sin of this time, and greatest occasion of dispersing the Plague. And that no Company or person be suffered to remain or come into any Tavern, Alehouse or Coffee-house to drink after nine of the Clock in the Evening, according to the ancient Law and custome of this City, upon the penalties ordained in that behalf.

And for the better execution of these Orders, and such other Rules and Directions as upon further consideration shall be found needful; It is ordered and enjoyned that the Aldermen, Deputies, and Common-Councelmen shall meet together Weekly, once, twice, thrice or oftner (as cause shall require) at some one general place accustomed in their respective Wards (being clear from infection of the Plague) to consult how the said Orders may be duly put in execution; not intending that any dwelling in or near places infected, shall come to the said meetings whiles their coming may be doubtful: And the said Aldermen and Deputies and Common Councelmen in their several Wards may put in execution any other good Orders that by them at their said Meetings shall be conceived and devised, for preservation of his Majesties Subjects from the Infection.

FINIS.

Transcriber’s Notes:

Some inconsistencies in spelling, hyphenation, and punctuation have been retained.

New original cover art included with this eBook is granted to the public domain.

*** END OF THE PROJECT GUTENBERG EBOOK ORDERS CONCEIVED AND PUBLISHED BY THE LORD MAJOR AND ALDERMEN OF THE CITY OF LONDON, CONCERNING THE INFECTION OF THE PLAGUE ***

Updated editions will replace the previous one—the old editions will be renamed.

Creating the works from print editions not protected by U.S. copyright law means that no one owns a United States copyright in these works, so the Foundation (and you!) can copy and distribute it in the United States without permission and without paying copyright royalties. Special rules, set forth in the General Terms of Use part of this license, apply to copying and distributing Project Gutenberg™ electronic works to protect the PROJECT GUTENBERG™ concept and trademark. Project Gutenberg is a registered trademark, and may not be used if you charge for an eBook, except by following the terms of the trademark license, including paying royalties for use of the Project Gutenberg trademark. If you do not charge anything for copies of this eBook, complying with the trademark license is very easy. You may use this eBook for nearly any purpose such as creation of derivative works, reports, performances and research. Project Gutenberg eBooks may be modified and printed and given away—you may do practically ANYTHING in the United States with eBooks not protected by U.S. copyright law. Redistribution is subject to the trademark license, especially commercial redistribution.

START: FULL LICENSE
