RESEARCH PROJECT 1. BIOSTATISTICS IN COMPUTATIONAL TOXICOLOGY
Investigators: Wright (P.I.), Gupta, Zou, Truong, Ibrahim, Lin, Nobel
The field of biostatistics has emerged in the last several decades as an indispensable branch of applied statistics, useful in medical and biological problems in which uncertainty plays a role. In this proposal, we are primarily concerned with biostatistical informatics, the application of biostatistical principles to problems arising in bioinformatics and computational biology. The explosive increase in high-throughput genomics data has brought a pressing need for statistically efficient procedures to quantify and explore these data and to perform meaningful inference. These procedures have also become increasingly specialized, involving disparate methods for sequence analysis and for the preprocessing of, and statistical inference on, so-called “-omics” data arising from transcriptomic, proteomic, and metabolomic (metabonomic) profiling. This list is by no means exhaustive: statistical reasoning is now increasingly applied in probabilistic models of gene networks and in the analysis of complex biological simulation models. Traditional areas of statistical genetics, such as association mapping, are also merging with other high-throughput methods such as transcription profiling as the density of genotyping assays increases. The volume of data on chemotoxicity profiles from EPA, NIEHS, and other sources is increasing markedly, and the interface between Projects 1 and 2 (Cheminformatics) will bring about a merging of techniques that had until recently developed largely independently.
RESEARCH PROJECT 2. CHEMINFORMATICS
Investigators: Tropsha (P.I.), Zheng, Golbraikh, Liu
Project 2 seeks to establish a universally applicable and robust predictive toxicology modeling framework based on rigorous Quantitative Structure-Activity/Property Relationship (QSAR/QSPR, used here interchangeably) methodologies. The framework has been refined over many years of our research on QSPR methodology development and its application to experimental datasets, which has led to novel analytical approaches (Zheng and Tropsha, 2000), descriptors (Golbraikh and Tropsha, 2003), model validation schemes (Golbraikh and Tropsha, 2002a; Tropsha et al., 2003), overall QSPR workflow design (Shen et al., 2004b; Kovatcheva et al., 2004), and multiple end-point predictors (for recent reviews, see Tropsha, 2003; Tropsha, 2005). This Project focuses on the design of optimized QSPR protocols for developing reliable models of critically important toxicological end-point properties, with the goal of sharing both modeling software and specialized predictors with the research community via a web-based Predictive Toxicology Portal. These objectives will be achieved via the concurrent development of QSPR methodology (Specific objective 1), the building of highly predictive, robust QSPR models of ADME-Tox properties (Specific objective 2), and the deployment of modeling software and individual predictors via a specialized web portal (Specific objective 3).
RESEARCH PROJECT 3. COMPUTATIONAL INFRASTRUCTURE FOR SYSTEMS TOXICOLOGY
The Computational Infrastructure for Systems Toxicology project (Project 3) of the proposed Environmental Bioinformatics Research Center (EBRC) will provide direct solutions to computationally challenging problems in Investigational Areas 1 (Improving Linkages in Source-to-Outcome Paradigm) and 3 (Enhanced Quantitative Risk Assessment). Furthermore, Project 3 will generalize the solution methods into a broad computational infrastructure that will support the other two EBRC projects, as well as EPA at large. In this proposal we describe the goals of Project 3 and its importance to the Center, the driving biological problem that will serve as a primer for the design of this infrastructure, the infrastructure design itself, and implementation issues on high-performance and grid computing facilities. We also describe how the other EBRC projects will interact with Project 3, how our personnel will assist in the development and integration of new analysis methods into user-friendly, open-access, web-based tools and resources, and how Project 3 will become a broader resource beyond the Center.
The EPA has elevated the importance of the Center’s role in performing outreach and translation activities (POTA). As a complement to our research programs, our Center created an equally important functional area, integrated into the research, for deploying bioinformatics technologies into the environmental community: our Translation Group. It is understood that the field of bioinformatics, as defined within the Center, is broad in scope and is changing rapidly with discoveries in science and advances in information technologies. The beneficiaries of the Center are equally diverse in their need to understand the application and implications of this cutting-edge science for resolving critical environmental issues. The EPA has identified potential beneficiaries of the Center’s research that span policy-makers, the public and other stakeholders, and science professionals.