SNP prioritization of multiple outcomes

RS47 · May 10, 2019, 6:18pm

I have ~1600 single nucleotide polymorphisms (SNPs) that I have correlated with 8 clinical inflammatory bowel disease outcomes from ~800 patients. The outcomes are surrogates for worse disease and are not independent. I would now like to prioritize the SNPs. So far I have ranked them by the sum of ranks across all 8 outcomes (lowest sum, highest priority), where each rank is based on the p-value for that SNP and outcome. Is there a more rigorous statistical approach, and/or one that would generate a combined p-value? Of note, the outcomes are of similar importance, without a clear primary outcome. Thank you.

f2harrell · May 10, 2019, 9:12pm

Here’s one approach. For each SNP, compute the likelihood ratio $\chi^2$ statistic from a logistic model predicting the SNP from all 8 outcomes (i.e., reversing the roles of outcome and input). Rank the 1600 SNPs by this $\chi^2$. Then repeat the entire process 200 times, sampling patients with replacement from the original data, to compute nonparametric bootstrap confidence intervals for the rank of each SNP, similar to what is done in Section 20.2.3 of BBR.
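
The steps in this reply can be sketched in Python roughly as follows. This is only an illustration under several assumptions that are not in the thread: each SNP is collapsed to a binary carrier indicator (a real analysis with three genotype categories would likely use an ordinal model), the data are simulated, and the function names `lr_chi2`, `rank_snps`, and `bootstrap_rank_ci` are made up for this example. The logistic model is fitted with a plain Newton-Raphson loop in NumPy rather than a statistics package.

```python
import numpy as np

def lr_chi2(y, X, n_iter=25, ridge=1e-8):
    """Likelihood ratio chi-square of a binary logistic model of y on X."""
    n = len(y)
    p_bar = y.mean()
    if p_bar == 0.0 or p_bar == 1.0:
        return 0.0  # SNP does not vary in this sample, nothing to model
    Xd = np.column_stack([np.ones(n), X])
    beta = np.zeros(Xd.shape[1])
    beta[0] = np.log(p_bar / (1 - p_bar))  # start at the intercept-only fit
    for _ in range(n_iter):
        eta = np.clip(Xd @ beta, -30, 30)
        mu = 1 / (1 + np.exp(-eta))
        grad = Xd.T @ (y - mu)
        w = mu * (1 - mu)
        # Tiny ridge term only guards against a singular Hessian
        hess = (Xd * w[:, None]).T @ Xd + ridge * np.eye(Xd.shape[1])
        step = np.linalg.solve(hess, grad)
        beta += step
        if np.max(np.abs(step)) < 1e-8:
            break
    eta = np.clip(Xd @ beta, -30, 30)
    ll_full = np.sum(y * eta - np.log1p(np.exp(eta)))
    ll_null = n * (p_bar * np.log(p_bar) + (1 - p_bar) * np.log(1 - p_bar))
    return max(2 * (ll_full - ll_null), 0.0)

def rank_snps(snps, outcomes):
    """Rank SNPs by LR chi-square; rank 1 is the strongest association."""
    chi2 = np.array([lr_chi2(snps[:, j], outcomes) for j in range(snps.shape[1])])
    ranks = np.empty(len(chi2), dtype=int)
    ranks[np.argsort(-chi2)] = np.arange(1, len(chi2) + 1)
    return ranks

def bootstrap_rank_ci(snps, outcomes, n_boot=200, level=0.95, seed=0):
    """Percentile bootstrap interval for the rank of each SNP."""
    rng = np.random.default_rng(seed)
    n, m = snps.shape
    observed = rank_snps(snps, outcomes)
    boot = np.empty((n_boot, m))
    for b in range(n_boot):
        idx = rng.integers(0, n, size=n)  # resample patients with replacement
        boot[b] = rank_snps(snps[idx], outcomes[idx])
    alpha = (1 - level) / 2
    lower, upper = np.quantile(boot, [alpha, 1 - alpha], axis=0)
    return observed, lower, upper

# Simulated stand-in data (hypothetical): 300 patients, 12 SNPs, 8 correlated outcomes
rng = np.random.default_rng(1)
n, m = 300, 12
severity = rng.normal(size=n)  # latent disease severity shared by all outcomes
outcomes = (severity[:, None] + rng.normal(size=(n, 8)) > 0).astype(float)
snps = (rng.random((n, m)) < 0.4).astype(float)
snps[:, 0] = (severity + rng.normal(size=n) > 0.3).astype(float)  # one truly associated SNP

observed, lower, upper = bootstrap_rank_ci(snps, outcomes, n_boot=50)
```

Here `observed[j]` is the rank of SNP `j` in the original data, and `lower[j]`, `upper[j]` bound its rank across resamples. Wide intervals would indicate that the data cannot reliably separate that SNP from its neighbors in the ranking.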

This approach allows every SNP to have different outcome weights. If you want to enforce a commonality of outcome patterns across SNPs, you might replace the $\chi^2$ with the Wilcoxon statistic or c-index computed by relating each patient’s mean rank across outcomes to the SNP categories.
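
A minimal sketch of this common-weights alternative, again under assumptions not stated in the thread: SNPs are reduced to two categories (carrier vs. non-carrier), so the c-index is the Wilcoxon-Mann-Whitney concordance probability, and the data are simulated. The helper names `midranks` and `c_index` are hypothetical.

```python
import numpy as np

def midranks(x):
    """1-based ranks of x, with tied values sharing their average rank."""
    _, inv, counts = np.unique(x, return_inverse=True, return_counts=True)
    cum = np.cumsum(counts)
    avg = cum - (counts - 1) / 2.0  # average rank of each distinct value
    return avg[inv]

def c_index(score, group):
    """P(score in group 1 > score in group 0), ties counted as 1/2."""
    n1 = int(np.sum(group == 1))
    n0 = len(group) - n1
    if n1 == 0 or n0 == 0:
        return 0.5  # undefined without both categories; treat as no association
    r = midranks(score)
    u = r[group == 1].sum() - n1 * (n1 + 1) / 2  # Mann-Whitney U statistic
    return u / (n1 * n0)

# Simulated stand-in data (hypothetical): 300 patients, 12 SNPs, 8 correlated outcomes
rng = np.random.default_rng(1)
n, m = 300, 12
severity = rng.normal(size=n)
outcomes = (severity[:, None] + rng.normal(size=(n, 8)) > 0).astype(float)
snps = (rng.random((n, m)) < 0.4).astype(float)
snps[:, 0] = (severity + rng.normal(size=n) > 0.3).astype(float)

# One severity score per patient: the mean of that patient's ranks on each outcome
score = np.column_stack(
    [midranks(outcomes[:, k]) for k in range(outcomes.shape[1])]
).mean(axis=1)

c = np.array([c_index(score, snps[:, j]) for j in range(m)])
priority = np.abs(c - 0.5)  # distance from 0.5 measures strength in either direction
order = np.argsort(-priority)  # SNP indices from highest to lowest priority
```

Because every SNP is compared against the same per-patient score, all outcomes carry equal weight for every SNP. The `priority` values could replace the $\chi^2$ as the ranking statistic inside the same bootstrap procedure.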