This question was inspired by a tweet that I can no longer retrieve. At the proposal stage we have strategies to make provisions for missing data. Do we have strategies to make provisions for outliers, so that we are not caught by surprise when the data come back? Are there any well-received guidelines for handling outliers?
This is going to depend on how you intend to analyze the data.
Bayesian methods handle this issue by specifying probability distributions for all parameters. There is a good example of how Bayesian methods handle outliers in BBR Session 5 at around the 25:00 mark. This is an attractive solution if a credible prior is used.
Frequentist procedures are numerous:
Nonparametric methods: if you were going to use a two-sample t-test, the two-sample Wilcoxon-Mann-Whitney test is the preferred alternative. Check out the relevant sections of Biostatistics for Biomedical Research for more details. Nonparametric tests are special cases of the more general proportional odds model. This approach is pretty hard to beat in terms of simplicity and accuracy.
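To see why rank-based tests are insensitive to outliers, here is a minimal NumPy sketch of the Wilcoxon-Mann-Whitney rank-sum statistic with a normal approximation (function name and data are mine for illustration; in practice you would use `scipy.stats.mannwhitneyu` or R's `wilcox.test`):

```python
import numpy as np
from math import erf, sqrt

def rank_sum_test(x, y):
    """Two-sample Wilcoxon-Mann-Whitney test via the rank-sum statistic.

    Normal approximation, no tie correction -- assumes distinct values."""
    pooled = np.concatenate([x, y])
    ranks = pooled.argsort().argsort() + 1      # ranks 1..N of the pooled data
    n1, n2 = len(x), len(y)
    w = ranks[:n1].sum()                        # rank sum of the first sample
    mu = n1 * (n1 + n2 + 1) / 2                 # null mean of the rank sum
    sigma = sqrt(n1 * n2 * (n1 + n2 + 1) / 12)  # null standard deviation
    z = (w - mu) / sigma
    p = 2 * (1 - 0.5 * (1 + erf(abs(z) / sqrt(2))))  # two-sided normal p-value
    return z, p

x = [1.1, 2.3, 3.2, 4.8, 5.6, 6.9]
y = [7.1, 8.4, 9.2, 10.5, 11.3, 12.8]
z1, p1 = rank_sum_test(x, y)

# Dragging the smallest x to an arbitrarily extreme low value leaves every
# rank, and therefore the test, completely unchanged:
x_out = [-1000.0] + x[1:]
z2, p2 = rank_sum_test(x_out, y)
```

Because only ranks enter the statistic, an observation can be made arbitrarily more extreme without moving the test at all, which is exactly the robustness being claimed.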
Robust parametric procedures: these all entail down-weighting observations in the tails of the distribution relative to those near the middle. The robust method should be pre-specified before the data are collected.
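As one concrete example of such down-weighting, here is a sketch of a Huber M-estimate of location computed by iteratively reweighted means (a minimal illustration, not a production implementation; names and tuning constant are my choices, with k = 1.345 the conventional value for 95% efficiency under normality):

```python
import numpy as np

def huber_location(x, k=1.345, tol=1e-8, max_iter=200):
    """Huber M-estimate of location via iteratively reweighted means.

    Observations within k scale units of the current estimate get full
    weight; more extreme observations are down-weighted by k/|r|."""
    x = np.asarray(x, dtype=float)
    mu = np.median(x)
    s = 1.4826 * np.median(np.abs(x - mu))  # MAD, scaled to estimate sigma
    for _ in range(max_iter):
        r = (x - mu) / s                    # standardized residuals
        w = np.where(np.abs(r) <= k, 1.0, k / np.abs(r))
        mu_new = np.sum(w * x) / np.sum(w)  # weighted mean update
        if abs(mu_new - mu) < tol:
            break
        mu = mu_new
    return mu

data = [1.0, 2.0, 3.0, 4.0, 5.0, 100.0]  # one gross outlier
estimate = huber_location(data)
```

The ordinary mean of these data is about 19.2, dragged up by the single outlier, while the Huber estimate stays near the bulk of the data (around 3.6).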
Adaptive estimation: you can estimate the weights by fitting a distribution to your data and then conduct the analysis using a permutation method. The permutation method guarantees that the size of the procedure (the α level) is maintained, and both tests and interval estimates can be calculated. This can squeeze out a bit more power relative to nonparametric methods. Thomas O'Gorman is one of the scholars who has published in this area.
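The permutation machinery itself is simple. Below is a sketch of a two-sample permutation test using a plain difference in means as the statistic (O'Gorman's adaptive methods substitute an estimated-weight statistic, which I do not reproduce here; the key point this illustrates is that referring any statistic to its permutation distribution preserves the α level):

```python
import numpy as np

def permutation_test(x, y, n_perm=2000, seed=0):
    """Two-sample permutation test of a location difference.

    The observed statistic is referred to the distribution obtained by
    randomly relabeling group membership, which keeps the type I error
    at the nominal level regardless of the statistic chosen."""
    rng = np.random.default_rng(seed)
    x, y = np.asarray(x, float), np.asarray(y, float)
    obs = x.mean() - y.mean()
    pooled = np.concatenate([x, y])
    n1 = len(x)
    count = 0
    for _ in range(n_perm):
        perm = rng.permutation(pooled)               # random relabeling
        stat = perm[:n1].mean() - perm[n1:].mean()
        if abs(stat) >= abs(obs):
            count += 1
    return (count + 1) / (n_perm + 1)                # two-sided Monte Carlo p

x = [0.8, 1.5, 2.1, 2.9, 3.3, 4.0, 4.6, 5.2]
y = [5.9, 6.4, 7.0, 7.8, 8.3, 9.1, 9.7, 10.2]
p = permutation_test(x, y)
```

The `+ 1` in numerator and denominator is the standard Monte Carlo correction that prevents a reported p-value of exactly zero.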
Adaptive Tests of Significance Using Permutations of Residuals with R and SAS (link)