Suppose I want to measure how the effect of a binary treatment varies with a continuous measure of pre-treatment risk. I’m evaluating two possible strategies:
Strategy 1: Use a simple case-control design, then flexibly model the interaction between treatment and pre-treatment risk with a penalized spline.
Strategy 2: Stratify sampling on a discretized version of the pre-treatment risk variable, to balance the sample across risk groups of particular interest.
My sample size is constrained, and I want to estimate the varying treatment effect as reliably as possible. I don’t know a priori which ranges of pre-treatment risk are important. Strategy 1 is my current preference because it doesn’t require me to decide how to discretize the continuous pre-treatment risk prediction. Yet Strategy 2 may be more powerful, for the usual reasons given for why stratified sampling increases statistical power.
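To make the comparison concrete, here is a minimal simulation sketch of the two designs. Everything specific in it is an assumption chosen for illustration, not part of my actual study: the Beta(2, 8) risk distribution, the quadratic `true_effect`, the stratum cut points, the knot positions, the noise level, and randomized treatment assignment. Strategy 1 is treated here as unstratified random sampling on risk, and the penalized spline is approximated by a ridge-penalized truncated-linear basis rather than a full penalized-spline implementation. All function names are hypothetical.

```python
import numpy as np

def spline_basis(r, knots):
    """Linear truncated-power basis: columns 1, r, (r - k)_+ for each knot k."""
    cols = [np.ones_like(r), r] + [np.clip(r - k, 0.0, None) for k in knots]
    return np.column_stack(cols)

def fit_varying_effect(r, t, y, knots, lam=1.0):
    """Fit y = f0(r) + t * f1(r) with a ridge penalty on the knot terms.

    Returns the coefficients of f1, the risk-varying treatment effect.
    """
    B = spline_basis(r, knots)
    X = np.hstack([B, B * t[:, None]])  # main-effect block, interaction block
    p = B.shape[1]
    penalty = np.zeros(2 * p)
    penalty[2:p] = lam       # knot terms of f0 (intercept and slope unpenalized)
    penalty[p + 2:] = lam    # knot terms of f1
    beta = np.linalg.solve(X.T @ X + np.diag(penalty), X.T @ y)
    return beta[p:]

def true_effect(r):
    """Hypothetical treatment effect that grows with risk."""
    return 2.0 * r ** 2

def draw_risk(pool, n, stratified, rng):
    """Sample n risk values, either at random or equally from four strata."""
    if not stratified:
        return rng.choice(pool, size=n, replace=False)
    cuts = np.array([0.1, 0.2, 0.35])  # assumed discretization
    stratum = np.digitize(pool, cuts)
    per = n // (len(cuts) + 1)
    return np.concatenate([
        rng.choice(pool[stratum == s], size=per, replace=False)
        for s in range(len(cuts) + 1)
    ])

def rmse_curve(stratified, n=400, reps=200, seed=0):
    """RMSE of the estimated effect curve over a grid of risk values."""
    rng = np.random.default_rng(seed)
    pool = rng.beta(2.0, 8.0, size=50_000)  # right-skewed population risk
    knots = np.linspace(0.1, 0.5, 5)
    grid = np.linspace(0.05, 0.5, 10)
    B_grid = spline_basis(grid, knots)
    errors = np.empty((reps, grid.size))
    for i in range(reps):
        r = draw_risk(pool, n, stratified, rng)
        t = rng.integers(0, 2, size=r.size).astype(float)
        y = 1.0 + r + t * true_effect(r) + rng.normal(0.0, 0.5, size=r.size)
        coef = fit_varying_effect(r, t, y, knots)
        errors[i] = B_grid @ coef - true_effect(grid)
    return grid, np.sqrt((errors ** 2).mean(axis=0))

if __name__ == "__main__":
    for label, flag in [("unstratified", False), ("stratified", True)]:
        grid, rmse = rmse_curve(flag)
        print(label, np.round(rmse, 3))
```

Comparing the two printed RMSE curves across the risk grid shows where, if anywhere, each design estimates the effect more precisely under these assumptions. Note that the stratified fit still uses the continuous spline model, so the discretization only affects who is sampled, not how the effect is modeled.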
I need help deciding which of these strategies (or other strategies) to go with.