P values and small sample size

Sudhi_Upadhyaya · June 3, 2023, 5:00pm

My apologies if this topic has been discussed several times already. In a frequentist approach, when fitting a simple linear model to a dataset with a sample size of, say, 100 subjects and 6 covariates, each with 3 levels, does the p value matter? How much importance should be given to p values in this scenario, especially when p > 0.05?

f2harrell · June 4, 2023, 12:25pm

p > 0.05 seldom means very much in any case. But in this case, with a very small sample size, if using a frequentist approach I would emphasize confidence limits (compatibility intervals) much more. Note that a large p is interpreted as “at present we do not have sufficient evidence to say that the data are incompatible with a supposition of no effect”, the short version of this being “get more data”. Bayesian approaches, on the other hand, provide evidence in favor of any assertion.
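
To illustrate the point about emphasizing confidence limits, here is a minimal sketch in Python. It is not taken from the thread: the function name `slope_with_interval`, the simulated data, and the single numeric covariate are all assumptions made for illustration, and the interval uses a normal approximation rather than the exact t quantile, so it is slightly too narrow at small n.

```python
import math
import random
from statistics import NormalDist


def slope_with_interval(x, y, level=0.95):
    """Least-squares slope of y on x with an approximate confidence interval.

    Hypothetical helper: uses a normal quantile instead of the t quantile,
    so the interval is a little too narrow for small samples.
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Residual sum of squares and the standard error of the slope.
    rss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    se = math.sqrt(rss / (n - 2) / sxx)
    z = NormalDist().inv_cdf(0.5 + level / 2)
    # Two-sided p value for the null hypothesis "slope = 0" (normal approximation).
    p_value = 2 * (1 - NormalDist().cdf(abs(slope / se)))
    return slope, se, (slope - z * se, slope + z * se), p_value


# Simulated data (an assumption, not data from the thread): 100 subjects,
# a modest true slope, and a lot of noise.
random.seed(1)
x = [random.gauss(0, 1) for _ in range(100)]
y = [0.2 * xi + random.gauss(0, 1) for xi in x]

slope, se, (lower, upper), p_value = slope_with_interval(x, y)
print(f"slope = {slope:.3f}, p = {p_value:.3f}")
print(f"approx. 95% interval: ({lower:.3f}, {upper:.3f})")
```

The interval shows the whole range of slopes that the data are reasonably compatible with, which is more informative than knowing only on which side of 0.05 the p value fell.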

When the sample size is small, not using any extra-data information is especially problematic, even to the extent that a flat prior that merely disallows the “wrong” direction of a regression coefficient will improve the analysis.
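
One way to picture a flat prior that only disallows the “wrong” direction is sketched below. This is an illustration under stated assumptions, not the method used by anyone in the thread: it assumes the coefficient estimate is approximately normal with a known standard error, so that the posterior under a flat prior on non-negative values is a normal distribution truncated at zero. The function name `positive_flat_prior_posterior` and the example numbers are hypothetical.

```python
import math
from statistics import NormalDist


def positive_flat_prior_posterior(beta_hat, se, threshold=0.0):
    """Posterior summaries for a coefficient under a flat prior on beta >= 0.

    Assumes beta_hat ~ Normal(beta, se^2) with se treated as known, so the
    posterior is Normal(beta_hat, se^2) truncated to [0, infinity).
    Returns (posterior mean, P(beta > threshold | data)); threshold must be >= 0.
    """
    if threshold < 0:
        raise ValueError("threshold must be non-negative under this prior")
    std_normal = NormalDist()
    alpha = (0.0 - beta_hat) / se
    # Untruncated posterior mass on the allowed side; this can underflow
    # if beta_hat is many standard errors below zero.
    mass_allowed = 1.0 - std_normal.cdf(alpha)
    # Mean of a normal distribution truncated below at zero.
    posterior_mean = beta_hat + se * std_normal.pdf(alpha) / mass_allowed
    t = (threshold - beta_hat) / se
    prob_above = (1.0 - std_normal.cdf(t)) / mass_allowed
    return posterior_mean, prob_above


# Hypothetical estimate and standard error, chosen only for illustration.
post_mean, prob = positive_flat_prior_posterior(beta_hat=0.15, se=0.10, threshold=0.10)
print(f"posterior mean = {post_mean:.3f}, P(beta > 0.10 | data) = {prob:.3f}")
```

The second returned value is a direct probability statement about the coefficient, which is the kind of evidence in favor of an assertion that a p value does not provide.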

Sudhi_Upadhyaya · June 4, 2023, 6:08pm

Thanks Frank, this makes perfect sense to me.

technocrat · June 10, 2023, 8:24am

One of the truly hard things to do without really a lot of reminders like this is to go with “OK, so I still don’t have a good answer, just another question.” Making good use of disappointments is crucial because there are so many more disappointments in data than unexpected treasures.

Sudhi_Upadhyaya · June 13, 2023, 3:00pm

@technocrat, it depends on how one defines “disappointment in data”.

technocrat · June 20, 2023, 6:46pm

I had in mind the reaction “I didn’t find what I hoped to,” which is the common experience leading to motivated reasoning and dodges such as p-hacking.

Sudhi_Upadhyaya · June 20, 2023, 8:16pm

@technocrat, I agree.