Not randomized and no intervention assigned/delivered - what next?

Good afternoon all. I’m new here!! @f2harrell thanks for the suggestion to come on in!

We’ve had an unusual occurrence in relation to the deployment of a survey - and could use some advice.

Here are the details of what was intended:
The online survey will be programmed into REDCap and conducted among ~3200 participants using XXX, a national survey research aggregation firm. A selection of XXX’s panels of potential participants will be shown basic, non-content information about the survey (i.e., length of time to complete and a compensation amount). These members of XXX’s database are autonomously searching for survey opportunities and will self-select to learn more about this survey opportunity. Those who self-select will be sent to REDCap, where they will view an Introduction page that describes the purpose of the study and expectations of those who participate. Those who choose to continue from there will be redirected to the anonymous Demographic screening form. Those who choose not to participate will be thanked for their time but will not continue to the Demographic screening form.

Participants will be randomized to one of four conditions defined by the wording used (community informed versus traditional) and the method of presentation (video versus text). To minimize the potential for imbalance on key covariates, randomization will be stratified by:
● Health Literacy: High (Extremely/Quite a bit confident filling out medical forms by myself); Low (Somewhat/A little bit/Not at all confident filling out medical forms by myself)
● Educational Attainment: Less than High School, High School Diploma or Equivalent, Some College/Technical School or Greater
● Age: Less than 55, 55+

What happened:
There was an unusual technical glitch that allowed 328 participants to enter the survey without randomization and without any intervention being delivered. The issue traced to the external module in REDCap that controls the randomization/stratification and then the insertion into one of the 4 arms for intervention delivery. The “bucket” capturing higher educational attainment was “full,” and there was no stopgap in place to exit those participants, so they were skipped through and pushed straight into answering the outcomes survey.
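For what it's worth, the missing stopgap amounts to a capacity check before a participant is passed into the randomization step. A minimal sketch of that check (stratum names and quota values here are hypothetical, not from the actual study or REDCap module):

```python
def admit_to_randomization(stratum, counts, quotas):
    """Return True only if the participant's stratum still has capacity.
    When the quota is full, the participant should be routed out
    gracefully, not passed through to the outcomes survey unrandomized."""
    if counts.get(stratum, 0) >= quotas[stratum]:
        return False  # stopgap: quota full, exit this participant
    counts[stratum] = counts.get(stratum, 0) + 1
    return True

# Hypothetical quotas for one stratification factor
quotas = {"high_education": 2, "low_education": 4}
counts = {}
```

The third high-education participant would be turned away rather than silently passed through, which is exactly the behavior that was missing.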

Now, the question here is what to do.

  • If the participants had been randomized, but then just happened to receive no intervention, that could proceed via ITT. But, there was no randomization.
  • I would normally be jumping up and down to say we don’t just “remove” or “replace” them – because that would be more post-randomization exclusion. But, that’s not the case here, because they didn’t properly randomize.

I’ve tried to think through a few options:

  • Redeploying those slots to capture an additional 328 participants. If we do this, should recruitment be general and just “open” to all comers, or should we aim to capture 328 additional participants with the higher level of educational attainment, since the “pass-through” differentially hit that subgroup over time? (It wasn’t that 328 participants in a row just happened to be in that category; it was a cumulative effect.) Note, though, that we are not trying to achieve a representative sample on education. In fact, we are oversampling for low education, so targeting 328 with higher education may seem counter-productive, but there could be a data-integrity or statistical impact to consider.
  • Do we randomize post-hoc and then proceed via ITT (though we know no intervention was delivered)?
  • Do we take the loss against the planned sample size because of the unforeseen complication?

Appreciate any thoughts or ideas!!


Hi Cheryl,

Good question. I see no issue in simply excluding the participants, as they didn’t undergo randomisation (so it isn’t post-randomisation exclusion). You can then replace them with 328 new recruits to get back up to the planned sample size. If you pre-specified quotas on educational attainment, then you’ll want to oversample new participants with high attainment here. But if you didn’t (and you’d actually prefer more participants with low attainment), you can recruit them as you did before.

However, I’d do a check to see if there were other issues with the randomisation:

I’d plot the cumulative number of participants randomised to each condition (on the y-axis, with a separate line per condition) against time (on the x-axis), with a separate plot for each stratum. I’d add a dotted vertical line at the time point where participants in the affected stratum stopped being randomised. If there are no major issues with the randomisation, the four lines should stay relatively close to one another over time.
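The data behind that diagnostic plot is just a running count per condition in enrollment order. A stdlib sketch of computing the y-series for one stratum (condition labels are hypothetical):

```python
def cumulative_counts(assignments):
    """assignments: condition labels in randomisation (time) order.
    Returns {condition: cumulative count after each enrollment},
    i.e., one y-series per line on the diagnostic plot."""
    conditions = sorted(set(assignments))
    running = dict.fromkeys(conditions, 0)
    series = {c: [] for c in conditions}
    for a in assignments:
        running[a] += 1
        for c in conditions:
            series[c].append(running[c])
    return series
```

Plot each series against enrollment order (or timestamp), one panel per stratum; roughly parallel, interleaved lines suggest the allocation was behaving as expected.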

Note: had the 328 ‘problem’ participants been randomised (by the computer program) but not received the intervention, it would still be fine to exclude them (i.e., the issues with these problem participants in no way invalidated the results for participants who had already been successfully randomised to receive their intervention).

I hope that helps. If anyone else has a different perspective, please let me know.




I agree with @Harry_TB, both in terms of excluding these subjects, and to also check your randomization procedures generally, to ensure that there are no other integrity issues.

Unless you have a protocol or SAP that dictates otherwise, the default ITT cohort is the group of subjects that were actually randomized. This group of subjects, which is roughly 10% of your total, does not meet that definition.

In clinical trials, it is also common that there can be a modified ITT (mITT) cohort, which is the group of subjects that were randomized AND received at least one dose of the study defined treatment. From what I can tell, these 328 subjects did not receive the “treatment” and were only assessed for the outcome.

So, they fail to qualify for both the ITT and the mITT cohorts, further supporting their exclusion.

Barring the identification of other issues with the enrollment and randomization procedures, I would look to replace these subjects with new enrollees, ensuring that the screening and randomization procedures do not allow more subjects into strata that were already filled based upon your a priori quotas.

One other potential consideration would be the study time and cost impact to replace these subjects now on a post hoc basis. If that is not material, then proceed with replacing them as above.

If the process of replacing them will have a material and negative impact on your study vis-a-vis time and cost, then consider the tradeoffs associated with doing so, versus just proceeding with the cohort that you have, since you have only lost 10%.

Those decisions may be affected by whether or not your study was formally powered with a priori null and alternative hypotheses, along with the commonly utilized bump in the calculated sample size to account for potential subject loss (that bump is commonly around 10–15%). Or was the target sample size a “convenience sample,” based on educated a priori guesses about how many subjects you could enroll within an acceptable time period and within your study budget?

One other thing to keep in mind, again perhaps influenced by your study protocol and any SAP, is the general notion of “analyze as you randomize,” typically attributed to Fisher. Since you used a stratified randomization approach, give strong consideration (if you have not already) to analyzing your outcomes using multivariable models, where your stratification factors, along with the assigned treatment group, are the independent variables.
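As a concrete illustration of “analyze as you randomize,” the model’s design matrix reference-codes each stratification factor alongside the arm indicator. A minimal stdlib sketch (arm and level names are hypothetical placeholders, not the study’s actual coding):

```python
def reference_code(value, levels):
    """Dummy-code `value` against `levels`; the first level is the
    reference category and contributes no columns."""
    return [1 if value == lvl else 0 for lvl in levels[1:]]

def design_row(arm, literacy, education, age_group):
    """One row of the design matrix: intercept, arm dummies, then
    dummies for each stratification factor (hypothetical level names)."""
    arms = ["traditional_text", "traditional_video",
            "community_text", "community_video"]
    row = [1]                                              # intercept
    row += reference_code(arm, arms)                       # 3 arm dummies
    row += reference_code(literacy, ["low", "high"])       # 1 dummy
    row += reference_code(education, ["lt_hs", "hs", "some_college"])
    row += reference_code(age_group, ["lt_55", "ge_55"])
    return row
```

Feeding rows like these into any regression routine gives arm effects adjusted for the stratification factors, which is the point of analyzing as you randomized.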

See as one reference on that topic.


@Harry_TB and @MSchwartz - thank you both for the input!! This has been very helpful discussion. Will also be sure to revisit our randomization to ensure there were no other issues that crept in.