Topological descriptor selection for a quantitative structure-activity relationship (QSAR) model to assess PAH mutagenicity
Caitlin Sextonᵃ, Trevor Sleight, P.E.ᵃ,ᵇ, Carla Ng, Ph.D.ᵇ, Leanne Gilbertson, Ph.D.ᵃ
ᵃ Gilbertson Lab Group, ᵇ Ng Lab Group, Department of Civil and Environmental Engineering
Caitlin Sexton
Caitlin Sexton is a junior chemical engineering student originally from Allentown, PA. Her interests include the various aspects of sustainability as applied to the chemical industry. She aims to use her research experience in environmental hazards and data analysis as a guide in her post-graduate career.
Trevor Sleight
Trevor Sleight is a third-year Ph.D. student, co-advised by Dr. Ng and Dr. Gilbertson. His research interests include environmental health, data analysis, and biodegradation.
Abstract
Polycyclic aromatic hydrocarbons (PAHs) are an abundant byproduct of industrial and natural pyrogenic processes. PAHs tend to persist in soil, providing a rich nutrient source for degrading bacteria. The degradation process may lead to the formation of toxic metabolites; however, there is limited research examining the hazards of transformation products in soil. Currently, the Environmental Protection Agency (EPA) classifies 16 toxic PAHs as priority pollutants without addressing the harmful metabolites. This project aims to select descriptors for a quantitative structure-activity relationship (QSAR) model based upon a data set of known mutagens and non-mutagens from the TA98 Ames test. A logistic regression model identified 20 significant descriptors representing the molecular features linked to mutagenicity classification. Of these 20 descriptors, the number of rings larger than 12 members containing oxygen (nFG12HeteroRing), the average centered Broto-Moreau autocorrelation (AATSC6c), and the z-modified information content index (ZMIC4) had the strongest link to mutagenicity classification, based upon assessment of the corresponding logistic regression coefficients. These descriptors highlight the molecular structures that contribute to the mutagenicity of PAHs within biodegradation pathways.
Carla Ng, Ph.D.
Carla Ng is an Assistant Professor of Civil and Environmental Engineering. Her group focuses on understanding and predicting the biological impacts of chemicals in the environment.
Leanne Gilbertson, Ph.D.
Dr. Gilbertson is an Assistant Professor in the Department of Civil and Environmental Engineering at the University of Pittsburgh. Her research group is currently engaged in projects aimed at informing sustainable design of emerging materials and technologies proposed for use in areas at the nexus of the environment and public health.
Ingenium 2021
Significance Statement
This study aims to assess the mutagenicity, and therefore the environmental hazard, of PAHs in soil using a computational QSAR model. PAH mutagenicity is difficult to assess because of the variety of metabolites that can result from the many possible biodegradation pathways. The descriptors discussed in this study will be the basis of the QSAR model.
Category: Computational Research
Keywords: PAH, Mutagenicity, QSAR, Logistic regression
1. Introduction
Polycyclic aromatic hydrocarbons (PAHs) contain at least two aromatic rings and are byproducts of various natural and industrial processes, including forest fires, the extraction and burning of fossil fuels, and plastic manufacturing. As a result of these processes, PAHs are commonly found in the atmosphere and soil of surrounding ecosystems. PAHs in soil tend to be more persistent, however, and are a possible source of carbon for degrading bacteria [1]. The Environmental Protection Agency (EPA) currently classifies 16 PAHs as priority pollutants [2], and there is increasing concern over the mutagenic properties of some PAHs in the human body. Degrading bacteria transform these parent PAHs, via many possible pathways, into various metabolites that may have different toxic properties from the parent compounds, which makes it difficult to thoroughly assess the potential hazard in a laboratory setting.

Quantitative structure-activity relationships (QSARs) have proven to be a useful tool for characterizing the toxicity of large chemical datasets, including those from PAH biodegradation, because of their predictive power [3]. A QSAR classifies the toxic endpoint of input compounds based on empirical training data and relevant structural and electronic descriptors. This foundation provides reproducible predictions for input compounds that are chemically similar to the training set. The training dataset and the chosen descriptors used to build a QSAR are therefore crucial to its applicability [4]. A variety of powerful QSARs are currently available to the public that can predict narcotic toxicity, but they lack the specificity to classify mutagenic metabolites of PAHs in soil.
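To make the role of descriptors in a classification QSAR concrete, the following is a minimal sketch of how a logistic regression can relate descriptor values to a binary mutagen/non-mutagen label and how descriptors can then be ranked by the magnitude of their coefficients. It is illustrative only and is not the model or data used in this study: the descriptor names (desc_A, desc_B, desc_C), the synthetic data, and the fitting settings are all assumptions chosen for the example.

```python
import math
import random

random.seed(0)  # reproducible toy data

# Hypothetical descriptor names; they do not correspond to real descriptors.
NAMES = ["desc_A", "desc_B", "desc_C"]

def make_toy_data(n=200):
    """Synthetic compounds: the label depends strongly on desc_A,
    weakly on desc_B, and not at all on desc_C (an assumption)."""
    X, y = [], []
    for _ in range(n):
        row = [random.gauss(0, 1) for _ in NAMES]
        score = 2.0 * row[0] - 0.5 * row[1] + random.gauss(0, 0.5)
        X.append(row)
        y.append(1 if score > 0 else 0)  # 1 = "mutagen", 0 = "non-mutagen"
    return X, y

def standardize(rows):
    """Scale each descriptor column to zero mean and unit variance so that
    coefficient magnitudes are comparable across descriptors."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    stds = [math.sqrt(sum((v - m) ** 2 for v in c) / len(c)) or 1.0
            for c, m in zip(cols, means)]
    return [[(v - m) / s for v, m, s in zip(r, means, stds)] for r in rows]

def sigmoid(z):
    z = max(-30.0, min(30.0, z))  # clamp to avoid overflow in exp
    return 1.0 / (1.0 + math.exp(-z))

def predict_proba(w, b, x):
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)

def fit_logistic(X, y, lr=0.5, epochs=500, l2=0.01):
    """Batch gradient descent on the L2-penalized log-loss."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = predict_proba(w, b, xi) - yi
            for j in range(d):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * (gj / n + l2 * wj) for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b

X_raw, y = make_toy_data()
X = standardize(X_raw)
w, b = fit_logistic(X, y)

# Rank descriptors by absolute coefficient, largest first.
ranked = sorted(zip(NAMES, w), key=lambda t: abs(t[1]), reverse=True)

# Training accuracy at a 0.5 probability threshold.
preds = [1 if predict_proba(w, b, xi) >= 0.5 else 0 for xi in X]
accuracy = sum(p == t for p, t in zip(preds, y)) / len(y)

for name, coef in ranked:
    print(f"{name}: {coef:+.3f}")
print(f"training accuracy: {accuracy:.2f}")
```

Standardizing the descriptor columns before fitting is what makes the coefficient magnitudes comparable; without it, a descriptor measured on a larger numerical scale would receive a smaller coefficient regardless of its importance. In practice, an established library implementation and a held-out test set would be preferable to this hand-rolled fit.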