Quefeng Li
University of North Carolina at Chapel Hill, USA
Posters & Accepted Abstracts: J Biom Biostat
In the era of big data, individual patient data (IPD) have become more accessible, and integrative analyses using IPD from multiple studies are now widely conducted to identify prognostic genes. It is well recognized that genes do not work alone but through pathways. In this talk, I will present a general statistical framework for pathway and gene identification from integrative analysis. Our framework applies a hierarchical decomposition to genes' effects, followed by a suitable regularization, to identify important pathways and genes across multiple studies. Asymptotic theory is provided to show that our method is consistent in both pathway and gene selection. We show explicitly that pathway selection consistency requires milder statistical conditions than gene selection consistency, because it tolerates false positives and false negatives at the gene selection level. In various simulation studies, the finite-sample performance of our method is superior to that of other ad hoc methods. We further apply our method to analyze five cardiovascular disease studies.
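To make the idea of a hierarchical decomposition concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not the formulation used in the talk: the three-level product form (pathway factor times gene factor times study-specific factor), the non-overlapping pathway assignment, and all names (PATHWAY_OF, compose_effects, selected) and numbers are hypothetical.

```python
# Hypothetical toy setup (not from the talk): 2 studies, 5 genes, 2 pathways.
# PATHWAY_OF[j] is the pathway index of gene j; pathways are assumed
# non-overlapping purely to keep the sketch short.
PATHWAY_OF = [0, 0, 0, 1, 1]


def compose_effects(gamma, theta, xi):
    """Assumed decomposition: beta[m][j] = gamma[pathway of j] * theta[j] * xi[m][j].

    gamma: one factor per pathway, theta: one factor per gene,
    xi: one row of study-specific factors per study.
    """
    return [
        [gamma[PATHWAY_OF[j]] * theta[j] * row[j] for j in range(len(theta))]
        for row in xi
    ]


def selected(beta, n_pathways):
    """Return (pathway_selected, gene_selected) as lists of booleans.

    A gene counts as selected if its effect is nonzero in at least one study;
    a pathway counts as selected if it contains at least one selected gene.
    """
    n_genes = len(beta[0])
    gene_sel = [any(row[j] != 0 for row in beta) for j in range(n_genes)]
    pathway_sel = [
        any(gene_sel[j] for j in range(n_genes) if PATHWAY_OF[j] == k)
        for k in range(n_pathways)
    ]
    return pathway_sel, gene_sel


# Sparse factors such as a penalized fit might produce (values are made up).
gamma = [1.5, 0.0]                   # second pathway switched off entirely
theta = [1.0, 0.0, 2.0, 1.0, 1.0]    # second gene switched off within pathway 0
xi = [
    [0.5, 1.0, -1.0, 0.3, 0.2],      # study 1
    [0.4, 1.0, 0.0, 0.1, 0.7],       # study 2
]

beta = compose_effects(gamma, theta, xi)
pathway_sel, gene_sel = selected(beta, n_pathways=2)
print(pathway_sel)
print(gene_sel)
```

In this toy form, setting a pathway factor to zero removes every gene in that pathway from all studies at once, while a pathway stays selected as long as any one of its genes is retained. That is one intuitive reading of why pathway-level selection can remain correct even when individual genes within the pathway are wrongly kept or dropped.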
Email: quefeng@email.unc.edu