Public Health Informatics
Guiding Warfarin Clinical Trial Design Using Pharmacogenetic Simulations
Peter J Tonellato, PhD, Kourosh Ravvaz, MD, Chun-Yuan Huang, PhD
Highly sensitive genetic tests that detect variant alleles, combined with increasing genomic knowledge, offer physicians the ability to individualize a patient’s drug treatment. If pharmacogenomic treatment is successful, one anticipates a large reduction in adverse drug reactions, leading to improved patient care, improved outcomes, reduced treatment periods, and lower overall costs. Unfortunately, it is extremely expensive and time-consuming to conduct the clinical trials needed to identify the correct combination of genotypes, phenotypes, and clinical and personal data necessary to accurately model drug response, test treatment options, and produce the 'optimal' protocol. In addition, there are no modeling frameworks that extend the simulations and optimization to population-wide studies capable of guiding public health policy. Here, we propose the extension and confirmation of a clinical trial simulation framework that models warfarin dosing and INR response to guide clinical trial design. We ultimately extend the modeling and simulations to city- and county-wide predictions, which provide evidence to guide public health policy and help direct limited public health resources to avoid health disparities.
Anti-Coagulant Pharmacogenetic Clinical Trial Simulations to Predict Improved Patient Outcomes
Peter J Tonellato, PhD, Kourosh Ravvaz, MD, Chun-Yuan Huang, PhD, Michael Michalkiewicz, DVM, PhD
Genotype-specific treatments require a large collection of complex clinical trials to identify the combination of genotypes, phenotypes, and clinical data needed to accurately test and identify optimal treatment options in large, racially heterogeneous populations. However, these clinical trials are prohibitively expensive. Prof. Tonellato's group previously designed a mathematical modeling approach to simulate and predict trial outcomes, with the intent of reducing the number, cost, and complexity of the trials. In this project, a multidisciplinary research team has come together to refine and test the model using EMR data from Aurora Healthcare’s regional patient population. The team will conduct a series of racially diverse, population-wide simulations to demonstrate the use of mathematical models and predictive simulation to improve warfarin dosing in a large heterogeneous patient population and to provide evidence guiding "optimal" warfarin management policy. Finally, the clinical trial simulation framework will be extended, and simulations conducted, to predict city, county, and regional population outcomes, with the aim of producing evidence on how public health care policy at each of those levels may take genetic testing into consideration to guide improved healthcare.
Meaningful Use of EHRs in Identifying Rural-Urban Health Disparities Using Pharmacogenetics-based Clinical Avatar Simulations
Chun-Yuan Huang, PhD, Peter J Tonellato, PhD
Health disparities based on access, standard of care, and other factors associated with gender, race, education, income, and rural-urban residential location exist and are difficult to quantify. One particularly complicated disparity is associated with the relationship between racial residential segregation and health outcomes. One form of health disparity is observed in treatment outcomes and adverse drug events, such as those seen when warfarin is used to treat venous thromboembolism (VTE). We explore potential health disparities associated with rural-urban residency that arise in warfarin treatment of VTE. To accomplish this predictive study, we integrate EHRs with Milwaukee public health data to simulate the entire Milwaukee County population using Clinical Avatars. These Avatar populations are then used to detect underestimation of warfarin dose, which may lead to increased incidence of potentially preventable VTE. By comparing warfarin doses derived from pharmacogenetics-based algorithms to best-practice dosing for VTE treatment, this study illustrates that “Meaningful Use” of an EHR can potentially reveal health disparities in the incidence of VTE due to rural-urban differences.
Clinical Avatars for Breast Cancer Risk Prediction
Emmanuel Asante-Asamani, Omid Ghiasvand, Peter J Tonellato, PhD
12.7% of women born in the US will develop breast cancer in their lifetime. It is imperative to regularly and accurately assess an individual’s risk for breast cancer to determine a personalized course of preventive actions. Direct comparison between the various risk algorithms is extremely difficult because each algorithm was produced using a unique study population and data set. The lack of a method to systematically perform direct comparisons of important prediction algorithms for Caucasian and minority populations makes it very difficult for healthcare providers to decide which algorithm is best for their patients (especially if their patient population is largely minority). We have developed a method to perform these direct comparisons. We have systematically examined all of the published breast cancer absolute risk algorithms, developed a relational network for breast cancer risk factors, and started to use our simulated populations to interrogate the algorithms.
Predicting Clinical Validity of Bladder Cancer Nomograms
Kourosh Ravvaz, MD, Tracy M Downs, MD, Peter J Tonellato
Complex early-stage bladder cancer has a growing impact on individual and population health, health care costs, and medical treatment improvements. Bladder cancer is a heterogeneous disease that requires accurate risk-group stratification to precisely predict tumor progression and recurrence, and therefore to accurately treat even early-detected, high-risk patients using intravesical therapy. However, current knowledge of risk and treatment is not fully incorporated into commonly used nomograms. This retrospective study, conducted by a multidisciplinary group of researchers from UWM and the UW-Madison Carbone Cancer Center, will create a simulation framework to test and adjust existing nomograms to include recent clinical findings and produce “optimal” predictions of risk and outcomes.
Next-Generation Sequence Analysis
Biomedical Informatic Analysis of the RNA of NF1 Associated Malignant Peripheral Nerve Sheath Tumors (MPNST)
Chun-Yuan Huang, PhD, Chih-Lin Chi, PhD, Charles Joseph Murphy, Rui Du, Kourosh Ravvaz, MD, Zhengqiu Cai, PhD, Erik S. Gafni, Peter J Tonellato, PhD, Steven L Carroll, PhD
This research project is conducted by a group of researchers from Dr. Tonellato's laboratories at the UWM Zilber School of Public Health (LPHIG) and Harvard Medical School (LPM), together with our collaborators at the University of Alabama at Birmingham. Patients with neurofibromatosis type 1 (NF1) develop plexiform and intraneural neurofibromas, which in turn transform into aggressive sarcomas known as malignant peripheral nerve sheath tumors (MPNSTs). The genetic abnormalities promoting the initial pathogenesis of neurofibromas and their progression to MPNSTs are poorly understood. Although predicted clinically, molecular subtypes of these tumors have not yet been defined. Accurate tumor subtyping is required to rationally design new targeted therapies and define patient subclasses with differing prognoses and responses to specific therapies. We hypothesize that additional, as yet unknown, "driver" mutations cooperate with NF1 loss of heterozygosity to promote the pathogenesis of plexiform and intraneural neurofibromas and MPNSTs. To test this hypothesis, we will use NextGen sequencing to identify mutations (point mutations, small indels, fusion events) and alterations in expression in the transcriptome and/or exosome of human plexiform and intraneural neurofibromas and MPNSTs. We will identify neurofibroma- and MPNST-specific mutations by verifying that candidate mutations are absent from the transcriptomes of normal sciatic nerve and traumatic neuromas, from SNP databases, and from published human genomes. Our long-term objective is thus to comprehensively analyze the genetic abnormalities promoting neurofibroma and MPNST pathogenesis, define molecular subtypes of these tumors, and use this information to identify candidate therapeutic targets in each tumor subtype.
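The specificity check described above (keeping only candidate mutations that are absent from normal sciatic nerve, traumatic neuromas, SNP databases, and published genomes) amounts to a set difference. The following is a minimal sketch under stated assumptions, not the project's pipeline: the `Variant` layout, the function name `tumor_specific`, and all coordinates are hypothetical and chosen only for illustration.

```python
from typing import Iterable, NamedTuple, Set


class Variant(NamedTuple):
    """Hypothetical minimal variant key: chromosome, position, ref and alt alleles."""
    chrom: str
    pos: int
    ref: str
    alt: str


def tumor_specific(candidates: Iterable[Variant],
                   *backgrounds: Iterable[Variant]) -> Set[Variant]:
    """Return the candidates that appear in none of the background collections."""
    seen: Set[Variant] = set()
    for background in backgrounds:
        seen.update(background)
    return set(candidates) - seen


# Made-up example data (illustrative coordinates, not real findings).
tumor = [Variant("chr17", 100, "C", "T"), Variant("chr9", 200, "G", "A")]
normal_nerve = [Variant("chr9", 200, "G", "A")]
snp_database = [Variant("chr1", 300, "A", "G")]

specific = tumor_specific(tumor, normal_nerve, snp_database)
```

Matching on exact (chromosome, position, ref, alt) keys is the simplest possible criterion; a real analysis would likely also need to normalize indel representation before comparing.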
Novel whole genome NGS identification of somatic variants in non-small cell lung cancer
Peter J Tonellato, PhD, Ming You, PhD
Advances in high-throughput next-generation sequencing technologies provide a new means to simultaneously interrogate the entire spectrum of somatic molecular variation, providing new evidence for the type, molecular mechanism, and treatment targets in non-small cell lung cancer (NSCLC) and other cancers whose diagnosis and treatment options remain limited. Lung cancer is the leading cause of cancer mortality in the United States. Despite undergoing curative surgery, a significant percentage of patients with early-stage, node-negative NSCLC will develop recurrent disease and die. As stage alone is not enough to predict recurrence in any one individual, more sensitive predictive biomarkers are desperately needed. If patients with early-stage NSCLC who are likely to recur could be identified, trials of adjuvant therapy could be targeted to this group, where a greater benefit may be obtained, ultimately improving survival. Two labs, Ming You’s at the MCW Cancer Center and Peter Tonellato’s at UWM, have joined to pursue new methods and computational approaches to analyze and predict the functionally important molecular variants in NSCLC. We propose to develop a whole genome NGS analysis computational pipeline, implemented on the UWM computational cluster and consisting of a modular collection of software tools to parse, manage, analyze, and store de-identified, individual whole genomes in a replicable and computationally efficient manner. At this time, there is no stable, comprehensive whole genome NGS analysis platform for use in the MCW Cancer Center. This proposal seeks to establish and test such a pipeline, demonstrate its validity in producing replicated results in a computationally optimized manner, and simultaneously detect and predict the full collection of whole genome somatic variants (SNPs, indels, copy number variants, and structural variants) in a Type I cancer research pilot study.
Biomedical informatic analysis of the RNA of primary cells and tumors arising in a conditional XRCC4- and p53-deficient mouse background
Charles Murphy, Erik Gafni, Parameswary Muniandy, PhD, Himanshu Sharma, Peter J Tonellato, PhD, Catherine Yan, PhD
Combined inactivation of p53 and LIG4, or the Lig4 protein-interactor XRCC4, leads to cancer development in virtually every mouse cell type we have tested. Besides studying the fundamental role of non-homologous end-joining (NHEJ) proteins in hematopoietic stem cells and the immune system, we have developed several mouse models of human cancers based on conditional inactivation of the NHEJ DNA repair gene XRCC4 and p53. We hypothesize that, in addition to known mutations, unknown "driver" mutations cooperate with the DNA damage response (DDR) to promote the pathogenesis of the cancers that develop in our models. To test this hypothesis, we will use next generation sequencing and microarray technology to identify mutations (point mutations, small indels, fusion events) and alterations in expression in the transcriptome of the primary mouse cells of origin and of the cancers. We will perform miRNome and lincRNome analysis to investigate non-coding-mRNA regulation, which we anticipate will serve to identify common and tumor cell type-specific driver mutations. We anticipate that comparative analysis of the identified mutations against human datasets will identify new, unknown human driver mutations.
Zebrafish NGS Computational Resource
David Petering, PhD, Peter Tonellato, PhD
Zebrafish (Danio rerio) has been identified as a key organism to model human disease. Zebrafish has demonstrated itself as a powerful model for studying a wide spectrum of complex human disorders (e.g., cancer and cardiovascular disease) as well as for understanding vertebrate development. Experiments can combine the advantages of using large numbers of synchronously developing embryos drawn from an unparalleled array of normal, mutant, transgenic, and knock-out fish lines. These characteristics immediately facilitate research that investigates individual susceptibility to disease and the genome, epigenome, and regulation of gene expression in relation to environmental exposure from the earliest life stage into adulthood. The Zebrafish NGS Computational Resource will apply our combined expertise in next generation sequencing experiments and analysis to create a zebrafish cluster computing workflow comprising the bioinformatic tools needed to analyze and interpret next generation sequencing (NGS) information. This pipeline will become a shared resource of the UWM-NIEHS Centers for the zebrafish and human environmental health community. The NGS computational resource will include state-of-the-art NGS analysis tools integrated into Galaxy (g2.bx.psu.edu) to create a standardized biomedical informatics workflow computational system. Galaxy provides a graphical user interface for reproducible bioinformatics workflows, supporting shared, collaborative refinement of the parameters, parsing, and analysis of the typical NGS datasets supporting exposome experiments. Galaxy also supports distributed resource management (DRM) systems, thus providing automatic execution of predefined workflows in batch mode on a wide variety of systems (e.g., LSF, Condor, Torque, and Sun Grid Engine). We will test the pipeline on both the UWM computer cluster and Amazon Web Services. Both options allow users to launch analyses in a few minutes, with computational resources automatically scaled to fit the load of Galaxy's NGS analysis job queue. This system will also support biologists who lack detailed Unix or programming skills but who have taken a practicum-based workshop (such as described below) to understand, utilize, and execute powerful Unix-based software that would otherwise be inaccessible to them.
Public Health Genetics and Genomics
Methylmercury induced visual and neurodevelopmental deficits in zebrafish: The role of DNA methylation in the transgenerational inheritance of disease phenotypes
Thomas Achankunju, Michael J Carvan, PhD, Peter J. Tonellato, PhD
Developmental exposure to environmental pollutants such as pesticides, bisphenol A, dioxin, and hydrocarbon compounds has been associated with the onset of adult diseases and with transgenerational inheritance of those diseases. Our preliminary studies have found that developmental exposure to methylmercury (MeHg) is correlated with a reduced visual startle reflex and an altered response of the potassium ion channels of retinal bipolar cells in zebrafish. In addition, our preliminary studies have demonstrated that the third generation of the fish population showed altered visual response and retinal electrophysiology similar to those of the first generation. The third generation is the first generation that was not directly exposed to MeHg but inherited the altered physiology from the MeHg-exposed first generation. This is the first evidence of transgenerational inheritance of a phenotype due to developmental exposure to MeHg in any species. The transgenerational effect of MeHg has not been well characterized in the zebrafish model. The alterations in the expression of genes involved in vision, and the molecular mechanisms behind the transgenerational inheritance of visual defects due to developmental exposure to MeHg, are unknown, and no studies have been conducted to identify them. In our study, we are investigating the gene functions altered in the third generation due to developmental exposure to MeHg in the first generation. The role of DNA methylation, an epigenetic change, in this inheritance will also be investigated.
Water Quality Monitoring of Milwaukee Beaches for the Milwaukee Health Department (MHD)
John Hernandez, Chelsea Weirich, Todd Miller, PhD, Rui Du, Aurash Mohaimani, Peter Tonellato, PhD
Levels of the bacterium Escherichia coli are the major determinant in closing public beaches; the decision to close beaches is a significant public health concern for the Milwaukee Health Department (MHD). Lake Michigan water samples were collected from three beach sites (Bradford, McKinley, and South Shore) on 63 days spanning the period from early June until late August of 2012. These water samples were assessed for E. coli levels by the University of Wisconsin at Milwaukee (UWM) and MHD; fecal coliform levels in the samples were also investigated by UWM. Environmental conditions, including weather, rainfall, algae content, litter, and wildlife, were recorded for each beach on every date. The data have undergone extensive quality assurance, integration, and preliminary analysis. Future work aims to translate and load the data into a database for long-term storage, extensive analysis, and forecasting. Further work will automate data integration, translation, and loading; a web interface will be constructed to provide access to elements from the database and forecasting framework.
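The planned long-term storage could start from a single table of per-beach, per-day measurements. The sketch below uses Python's built-in `sqlite3` module; the table name, column names, and sample values are assumptions made for illustration, not the project's actual schema or data.

```python
import sqlite3

# Hypothetical schema: one row per beach per sampling date.
SCHEMA = """
CREATE TABLE IF NOT EXISTS beach_sample (
    beach          TEXT NOT NULL,   -- e.g. Bradford, McKinley, South Shore
    sample_date    TEXT NOT NULL,   -- ISO 8601 date
    ecoli_uwm      REAL,            -- E. coli level measured by UWM
    ecoli_mhd      REAL,            -- E. coli level measured by MHD
    fecal_coliform REAL,            -- fecal coliform level (UWM only)
    rainfall       REAL,            -- rainfall recorded for that date
    PRIMARY KEY (beach, sample_date)
)
"""


def load_samples(conn, rows):
    """Create the table if needed and insert the rows, replacing duplicates."""
    conn.execute(SCHEMA)
    conn.executemany(
        "INSERT OR REPLACE INTO beach_sample VALUES (?, ?, ?, ?, ?, ?)", rows
    )
    conn.commit()


conn = sqlite3.connect(":memory:")
# Made-up values for illustration only.
load_samples(conn, [
    ("Bradford", "2012-06-05", 120.0, 115.0, 300.0, 0.0),
    ("McKinley", "2012-06-05", 80.0, 90.0, 210.0, 0.0),
])
count = conn.execute("SELECT COUNT(*) FROM beach_sample").fetchone()[0]
```

Keying rows on (beach, sample date) means an automated loader can be re-run on the same file without creating duplicate records, which suits the automated integration step mentioned above.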
Robust taxonomic development using 16S rRNA pyrosequencing fragments
Charles J Murphy, Ryan Newton, PhD, Sandra McLellan, PhD, Peter J Tonellato, PhD
Next generation sequencing technology, such as pyrosequencing, can generate large sequence datasets to estimate bacterial communities in biological samples. Pyrosequencing often uses specific genomic regions, such as the 16S rRNA gene, as stable taxonomic markers. The primary analysis estimates bacterial communities in pooled biological samples, but it is complicated by variable-length sequence fragments. The rapid advance of pyrosequencing technology enables longer sequence reads, which poses the technical problem of correlating taxonomies between data from older technology (shorter sequence reads) and data from newer technology (longer sequence reads), where the longer and shorter sequences have overlapping regions. Methods to correlate bacterial communities between longer and shorter sequences are under active development. Presented here is the Hybrid Analysis (HA), which estimates bacterial communities in pooled samples containing variable-length sequence fragments. Initial tests of the HA algorithm are promising; further testing is required.