Our group is interested in the analysis and prediction of complex traits and diseases using genetic information (integrating pedigrees, genomics, and other omics) together with environmental information. Our research involves methods, software development, and applications in human health and in plant and animal breeding. Most of us are affiliated with the Department of Epidemiology and Biostatistics at Michigan State University.
Projects
Genomic Analysis and Prediction of Complex Traits. Development and evaluation of methods and software for analysis and prediction of complex traits using high-dimensional genomic data (e.g., SNPs, genotyping by sequencing, and other types of sequence data). Our research in this area has focused on the use of shrinkage and variable selection in parametric models, as well as on the use of some semi-parametric methods (e.g., RKHS).
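To make the idea of shrinkage concrete, here is a minimal sketch of ridge regression fitted by coordinate descent on a toy marker matrix with more markers than individuals. It is an illustration only and is not taken from the group's software: the function names, the toy genotypes and phenotypes, and the penalty values are all hypothetical, and ridge is just one of many shrinkage methods.

```python
# Ridge regression by coordinate descent: a toy sketch of shrinkage
# with more markers (p) than individuals (n). All data are made up.

def ridge_coordinate_descent(X, y, lam, n_iter=500):
    """Minimise ||y - Xb||^2 + lam * ||b||^2 one coefficient at a time."""
    n, p = len(X), len(X[0])
    b = [0.0] * p
    resid = list(y)  # residuals y - Xb; equal to y while b is all zeros
    col_ss = [sum(X[i][j] ** 2 for i in range(n)) for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            # Inner product of marker j with the residual that excludes marker j
            rho = sum(X[i][j] * (resid[i] + X[i][j] * b[j]) for i in range(n))
            new_bj = rho / (col_ss[j] + lam)  # the penalty shrinks the estimate
            delta = new_bj - b[j]
            for i in range(n):
                resid[i] -= X[i][j] * delta
            b[j] = new_bj
    return b


def squared_norm(b):
    return sum(v * v for v in b)


# Hypothetical toy data: 3 individuals, 5 markers coded 0/1/2
X = [[0.0, 1.0, 2.0, 1.0, 0.0],
     [1.0, 1.0, 0.0, 2.0, 1.0],
     [2.0, 0.0, 1.0, 0.0, 1.0]]
y = [1.2, 0.4, -0.7]

b_light = ridge_coordinate_descent(X, y, lam=1.0)    # mild shrinkage
b_heavy = ridge_coordinate_descent(X, y, lam=100.0)  # strong shrinkage
print(squared_norm(b_light), squared_norm(b_heavy))
```

A larger penalty pulls all marker effects closer to zero, which is what keeps the problem well posed when markers outnumber individuals.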
Genomics x Environment. Development of methods for integrating high-dimensional genomic and environmental data in a unified framework. We have developed methods that can model interactions between high-dimensional marker panels and high-dimensional environmental covariates. These methods were originally developed and tested with data from wheat trials. We are currently extending some of these methods for analysis of complex human traits and diseases.
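One common way to represent interactions between two high-dimensional predictor sets is to build a similarity (covariance) matrix from each set and take their element-wise (Hadamard) product. The sketch below assumes that construction for illustration; the text above does not state which formulation the group uses, and the function names and toy data are hypothetical.

```python
# Sketch: interaction covariance as the Hadamard product of a marker-based
# and an environment-based similarity matrix. All data are made up.

def linear_kernel(M):
    """Centre the columns of M and return Z Z' / (number of columns)."""
    n, p = len(M), len(M[0])
    means = [sum(row[j] for row in M) / n for j in range(p)]
    Z = [[row[j] - means[j] for j in range(p)] for row in M]
    return [[sum(Z[i][k] * Z[j][k] for k in range(p)) / p for j in range(n)]
            for i in range(n)]


def hadamard(A, B):
    """Element-wise product of two equally sized square matrices."""
    return [[a * b for a, b in zip(row_a, row_b)] for row_a, row_b in zip(A, B)]


# Each row is one phenotypic record (a genotype observed in an environment).
# markers[i]: marker codes of the genotype behind record i (hypothetical)
markers = [[0.0, 1.0, 2.0],
           [0.0, 1.0, 2.0],
           [2.0, 1.0, 0.0],
           [1.0, 0.0, 1.0]]
# env_covs[i]: environmental covariates of record i, e.g. temperature, rainfall
env_covs = [[21.0, 300.0],
            [27.0, 120.0],
            [21.0, 300.0],
            [24.0, 200.0]]

G = linear_kernel(markers)    # genomic similarity between records
E = linear_kernel(env_covs)   # environmental similarity between records
GE = hadamard(G, E)           # similarity used for the interaction term
print(GE)
```

Under this assumption, two records are similar for the interaction term only when they are similar both genetically and environmentally. In practice the environmental covariates would usually be standardised before building the kernel, which this sketch skips.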
Integration of Data from Multiple Omics Layers. Development of models and software for integrating high-dimensional multi-layer omics data. Our focus is on methods that can integrate whole-omics profiles and can model interactions between two or more high-dimensional predictor sets (e.g., genome-by-methylome interactions). We are currently working on using these methods for prediction of breast cancer outcomes and in plant omics applications.
Software Development for Analysis of Big Omics Data. We have developed several R packages for genetic analysis using pedigrees, genomes, and other omics (see the Software section below for further details).
Genomic Analysis of Obesity and Response to Exercise. We maintain an active collaboration with researchers from the TIGER (Training Interventions and Genetics of Exercise Response) study, developing and implementing methods for the identification of genetic factors influencing Body Composition and Response to Exercise Intervention.
Software
BGLR. The Bayesian Generalized Linear Regression R package implements a variety of shrinkage and variable selection methods. The package can be used with whole-genome data (e.g., SNPs, gene expression or other omics), pedigrees and non-genetic covariates, including high-dimensional environmental data. Article | CRAN | Source Code
BGData. A suite of R packages to enable analysis of extremely large genomic data sets (potentially millions of individuals and millions of molecular markers). Article | CRAN | Source Code
pedigreemm. An R package for analysis of complex traits and diseases with generalized linear mixed models fitted by likelihood methods. Article | Documentation | CRAN
pedigreeR. R functions related to pedigrees. Source Code
MTM. Implements Bayesian multi-trait Gaussian models with user-defined (co)variance structures. Documentation | Source Code
Areas of Interest: Genome-wide association and genomic selection studies for complex traits in potato; genetic engineering in potato using CRISPR/Cas9 technology.