# data analysis

- Formal statistical tests and inference
- data reduction: Data reduction (principal components, etc.), clustering, unsupervised learning
- accuracy: Accuracy and information measures, discrimination, calibration
- probability: Probability theory, meaning, and application, exclusive of statistical tests, etc.
- generalizability: Generalizability of studies and statistical inferences, sample representativeness, target population
- bayes: Bayesian data analysis, modeling, inference
- comparative methods: Comparative performance of statistical analysis methods and predictive modeling approaches
- uncertainty: Quantifying, estimating, and displaying uncertainty, and incorporating it into decision making. This includes but is not limited to confidence intervals, standard errors, Bayesian credible intervals, and sources of uncertainty.
- causal inference: Methods and approaches to causal inference
- data problems: Statistical approaches dealing with missing data and measurement error
- machine learning: Machine learning, exclusive of traditional statistical models
- variable selection: Selection of predictive features in multivariable modeling, one-at-a-time screening of variables, and the cost of feature selection compared to using fuller models, possibly with penalization (shrinkage; regularization)
- reporting: How results of data analyses should be reported, for example which summary statistics should be reported for a logistic regression model
- models: Formulation, parameter estimation, and interpretation of specific statistical models
- modeling strategy: General model specification issues, nonlinearities, interactions and heterogeneity of treatment effect, avoiding categorization, how to sequence multiple steps (which may involve multiple imputation and data reduction)
- model validation: Model validation and interpretation
- descriptive: Descriptive and exploratory data analysis, hypothesis generating more than confirmatory analysis