An HMM-based Optimal Multiple Testing Procedure for Genome-wide Association Studies

Zhi Wei
Computer Science Department, NJIT


Genome-wide association studies (GWAS) interrogate common genetic variation across the entire human genome in an unbiased manner and hold promise for identifying genetic markers with moderate or weak effect sizes. However, most conventional testing procedures ignore the dependency among markers and therefore suffer a severe loss of efficiency in GWAS. In this talk, I will present a data-driven testing procedure (PLIS) that employs hidden Markov models (HMMs) to exploit dependency information among adjacent SNPs. PLIS is shown to control the false discovery rate (FDR) at the nominal level while having the smallest false negative rate (FNR) among all valid FDR procedures. By re-ranking the significance of all SNPs with dependency taken into account, PLIS gains higher power than conventional p-value-based methods. Simulation results and an application to a type 1 diabetes (T1D) GWAS dataset demonstrate that the proposed approach has better reproducibility and yields more accurate results. Some extensions will be discussed at the end.
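
To make the re-ranking idea concrete, here is a minimal Python sketch of how an HMM-based procedure of this kind could work. It is an illustration under stated assumptions, not the implementation presented in the talk: it assumes a two-state chain (null, associated) along a single chromosome, z-scores that are N(0, 1) under the null and normal with a shifted mean under the alternative, and HMM parameters that are already known (in practice they would be estimated from the data, for example by EM). The function names and parameters (local_index_of_significance, lis_step_up, trans, init, alt_mean, alt_sd) are hypothetical. Each SNP is scored by its posterior probability of being null given all z-scores, a quantity usually called the local index of significance (LIS) in the HMM multiple-testing literature; SNPs are then ranked by that score and rejected while the running average of the ranked scores stays at or below the target FDR level.

```python
import numpy as np


def normal_pdf(z, mean, sd):
    """Density of a normal distribution evaluated at z."""
    return np.exp(-0.5 * ((z - mean) / sd) ** 2) / (sd * np.sqrt(2.0 * np.pi))


def local_index_of_significance(z, trans, init, alt_mean, alt_sd):
    """Posterior P(SNP i is null | all z-scores) via scaled forward-backward.

    Assumed model (illustrative only): state 0 = null with N(0, 1) emissions,
    state 1 = associated with N(alt_mean, alt_sd^2) emissions.
    trans[a, b] is P(next state = b | current state = a); init is P(first state).
    """
    z = np.asarray(z, dtype=float)
    trans = np.asarray(trans, dtype=float)
    init = np.asarray(init, dtype=float)
    m = len(z)
    emit = np.column_stack([normal_pdf(z, 0.0, 1.0),
                            normal_pdf(z, alt_mean, alt_sd)])

    fwd = np.zeros((m, 2))
    bwd = np.ones((m, 2))

    # Forward pass, rescaled at every step to avoid numerical underflow.
    fwd[0] = init * emit[0]
    fwd[0] /= fwd[0].sum()
    for i in range(1, m):
        fwd[i] = (fwd[i - 1] @ trans) * emit[i]
        fwd[i] /= fwd[i].sum()

    # Backward pass, rescaled the same way.
    for i in range(m - 2, -1, -1):
        bwd[i] = trans @ (emit[i + 1] * bwd[i + 1])
        bwd[i] /= bwd[i].sum()

    # The per-position scaling constants cancel when each row is normalized.
    post = fwd * bwd
    post /= post.sum(axis=1, keepdims=True)
    return post[:, 0]


def lis_step_up(lis, alpha):
    """Reject the k smallest LIS values, where k is the largest count whose
    running mean of sorted LIS values is at most alpha."""
    lis = np.asarray(lis, dtype=float)
    order = np.argsort(lis)
    running_mean = np.cumsum(lis[order]) / np.arange(1, len(lis) + 1)
    passing = np.nonzero(running_mean <= alpha)[0]
    reject = np.zeros(len(lis), dtype=bool)
    if passing.size:
        reject[order[: passing[-1] + 1]] = True
    return reject


# Toy usage with made-up numbers: three adjacent SNPs carry a signal.
z_scores = [0.1, -0.3, 3.5, 4.0, 3.8, 0.2]
lis = local_index_of_significance(
    z_scores,
    trans=[[0.95, 0.05], [0.20, 0.80]],
    init=[0.9, 0.1],
    alt_mean=3.0,
    alt_sd=1.0,
)
rejected = lis_step_up(lis, alpha=0.05)
```

Because the score for each SNP conditions on the z-scores of its neighbors, a moderately large z-score inside a run of associated SNPs can rank above an isolated one of the same size, which is the sense in which significance is re-ranked relative to a p-value ordering. Pooling scores across chromosomes, as a genome-wide analysis would require, is left out of this sketch.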