Advice for Pooling Two RCTs in Studying Risk Factors for Chronic Pain

I am a PhD candidate examining the long-term outcomes of ICU survivors, particularly chronic pain. I am after some advice about pooling studies and then analysing the data. I have access to the data of 2 multicentre RCTs, both with 6-month follow-up using the EQ-5D. I was hoping to pool the data and look at risk factors for the dichotomous outcome of chronic pain (yes/no) from the EQ-5D pain domain, potentially using logistic regression (or CART). A total of 10,800 patients were enrolled.

My questions are:

  1. What is the best way to use the pooled data from the 2 trials? There is some overlap between the 2 studies in terms of inclusion criteria. I’ve asked around and am getting a few different answers, ranging from doing nothing, to propensity-score matching (by trial), to inverse-probability weighting.

  2. What would your advice be for analysis of risk factors? I haven’t worked out the details or the number of variables, but it won’t be more than 10 or 20. The sample size should be large enough (even with conservative EPV values).
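
As a rough check on the EPV point, here is a minimal sketch of the arithmetic. The 10,800 patients and the upper limit of 20 candidate variables come from this post; the prevalence values are placeholders I made up for illustration, not estimates from either trial, and the function name is hypothetical.

```python
def events_per_variable(n_patients, outcome_prevalence, n_parameters):
    """Events per candidate parameter, counting the rarer outcome class as 'events'."""
    n_with_outcome = n_patients * outcome_prevalence
    n_without_outcome = n_patients - n_with_outcome
    # For a binary outcome the limiting sample size is the smaller of the two classes.
    limiting_events = min(n_with_outcome, n_without_outcome)
    return limiting_events / n_parameters


# ASSUMPTION: the prevalences below are illustrative placeholders only.
for prevalence in (0.10, 0.30, 0.50):
    epv = events_per_variable(10_800, prevalence, 20)
    print(f"prevalence={prevalence:.2f}  EPV={epv:.0f}")
```

The number of patients with a 6-month EQ-5D response will likely be smaller than the number enrolled, so the analysable sample, rather than 10,800, is the figure that belongs in `n_patients`.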

I appreciate everyone’s time.



Use one dataset to develop the model and the other to test it? I.e., I don’t think this is a meta-analysis exercise, where we typically want to estimate a treatment effect.
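
To make the develop-then-test idea concrete, here is a minimal sketch of what "testing" could look like once a model fitted in one trial has produced predicted risks for patients in the other. It computes discrimination (the c-statistic) and a crude calibration summary (the observed/expected ratio). The function names and the toy numbers are hypothetical and only illustrate the mechanics.

```python
def c_statistic(predicted_risk, observed):
    """Probability that a patient with the outcome has a higher predicted
    risk than a patient without it (ties count as 0.5)."""
    cases = [p for p, y in zip(predicted_risk, observed) if y == 1]
    controls = [p for p, y in zip(predicted_risk, observed) if y == 0]
    concordant = 0.0
    for p_case in cases:
        for p_control in controls:
            if p_case > p_control:
                concordant += 1.0
            elif p_case == p_control:
                concordant += 0.5
    return concordant / (len(cases) * len(controls))


def observed_expected_ratio(predicted_risk, observed):
    """Observed number of events divided by the sum of predicted risks."""
    return sum(observed) / sum(predicted_risk)


# ASSUMPTION: toy values standing in for the validation trial.
predicted = [0.10, 0.20, 0.30, 0.40, 0.60, 0.80]  # risks from the development model
observed = [0, 0, 1, 0, 1, 1]                     # chronic pain yes/no at 6 months

print("c-statistic:", round(c_statistic(predicted, observed), 3))
print("O/E ratio:", round(observed_expected_ratio(predicted, observed), 3))
```

The pairwise loop is quadratic in sample size, which is fine for a sketch; a rank-based calculation would be the usual choice for thousands of patients.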

Thanks Paul for your suggestion. One of the trials is looking at septic patients and the other is looking at fluid resuscitation. Sepsis is a potential risk factor for developing chronic pain, so using one trial to generate the model may not account for this. This is why using propensity-matching for the trial may work. Any thoughts?

I don’t think we can answer this well until we have a little better handle on this:

Deciding how to pool the data from trials is somewhat contingent on what you’re trying to accomplish, and (with all due respect) a lot of “look at risk factors” papers are rather confused as to their own primary aim.

Would you describe your goal as:

  1. trying to build some sort of risk score or prediction model that can estimate a patient’s risk of chronic pain based on a few easily-measured risk factors?

  2. trying to identify / estimate the effects of a few specific risk factors that are of particular interest so you may identify the most important targets for intervention?

These goals sound similar and many researchers fail to differentiate between the two - they simply perform multivariable regression modeling, shove all variables with significant p-values into a model and call it a day. Before starting your project, I think this needs to be clearly defined: what exactly do you hope to accomplish by “looking at risk factors” in this project?

There’s a third, more complicated option: you’re trying to determine which risk factors are most important in determining which patients benefit (or fail to benefit) from the intervention.

Once you have a better handle on that, we may be able to give better advice on pooling the data from the trials (if that is even advisable - it may not be).


I think you need to answer the above question first, but I’m very skeptical that propensity-score matching has any useful role to play here given what you’ve said so far.

Thank you so much for your reply. I appreciate it. For some background, my PhD is looking at the incidence, risk factors and burden of chronic pain in ICU survivors. The analysis of the above studies is to determine the non-modifiable risk factors, with the hypothesis being that it is not so much what the patient is admitted with that is associated with developing chronic pain (i.e. it is potentially what we are doing to the patient that is resulting in this). In addition, being able to communicate to the patient that they are at a higher risk of developing chronic pain may enable referral to an ICU follow-up clinic or a chronic pain clinic.

This leads on to an observational study looking at characterising chronic pain (location, type of pain, etc) and looking at modifiable risk factors (i.e. analgesia management, opioid type and dose, sedation, immobility, etc) that can then be targeted for an interventional trial.

To answer your question about goals for analysing this data, it would fall more toward goal 2 (identifying effects of specific risk factors of interest for potential targets for intervention). Although the identified risk factors may be non-modifiable presently, treatments may be developed in the future that may target these variables.


Excellent, thanks for responding.

A few additional brainstorming question(s) that come to mind:

  1. whether the randomized treatments in the respective RCTs are important in this analysis, or if you intend to essentially use them as cohort studies.

  2. were the randomized comparisons the same in both trials? If not, what were they, and how different are the respective agents with respect to the long-term outcome of pain?

  3. how different are the study populations? Do you expect the risk factors associated with pain to have similar relationships in one trial vs the other?
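
On the third question, one simple way to look at whether a binary risk factor behaves similarly in both trials is to compare the trial-specific odds ratios with a pooled estimate that stratifies by trial (Mantel-Haenszel). This is only a sketch under the assumption of a single binary risk factor and no other covariates; the counts are invented and the names `trial_a` and `trial_b` are placeholders, not the real trials.

```python
def odds_ratio(a, b, c, d):
    """Odds ratio for one 2x2 table.
    a: exposed with pain,   b: exposed without pain,
    c: unexposed with pain, d: unexposed without pain."""
    return (a * d) / (b * c)


def mantel_haenszel_or(tables):
    """Pooled odds ratio across strata (here, trials), keeping each trial's
    patients compared only with patients from the same trial."""
    numerator = sum(a * d / (a + b + c + d) for a, b, c, d in tables)
    denominator = sum(b * c / (a + b + c + d) for a, b, c, d in tables)
    return numerator / denominator


# ASSUMPTION: invented counts, purely to show the calculation.
trial_a = (30, 70, 20, 80)
trial_b = (45, 55, 30, 70)

print("OR in trial A:", round(odds_ratio(*trial_a), 2))
print("OR in trial B:", round(odds_ratio(*trial_b), 2))
print("Pooled OR stratified by trial:", round(mantel_haenszel_or([trial_a, trial_b]), 2))
```

If the trial-specific odds ratios differ a lot, a single pooled number hides that, which is the reason for asking the question before deciding how to pool.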

This is feeling very similar to when I’m trying to get information out of my kids! For that, I apologise. I’m a novice researcher and have rudimentary statistical knowledge, so this is a great learning opportunity for me. Thanks for your prompt reply. To answer your questions:

The interventions in each of the trials aren’t important: they were IV fluids and steroids, and neither is considered a risk factor in the development of chronic pain. I would like to treat the trials as cohort studies.

The 2 trials whose data I’m looking at using are the Chest Trial (Myburgh et al, 2012, NEJM) and the Adrenal Trial (Venkatesh et al, 2018, NEJM). As I mentioned, the interventions are not considered risk factors for developing chronic pain. The Adrenal trial population is patients suspected of having sepsis. Sepsis is possibly a risk factor for chronic pain. The Chest trial population was patients requiring a fluid bolus, and 30% of them had sepsis.

The study populations do differ somewhat. Those from Chest were all-comers to ICU that required fluid boluses. The Adrenal study required the patients to be mechanically ventilated with a probable diagnosis of sepsis. This is reflected by the respective APACHE2 scores (severity of illness) of 17 for Chest and 24 for Adrenal. Age and gender are comparable. Chest had about 10 percentage points more surgical patients than Adrenal (40% vs 30%). I do expect the risk factors for pain to have similar relationships in both trials. As I mentioned in the previous post, I think it is more about what we do to the patients, rather than what they come in with (with the exception of trauma and burns), but this is a hypothesis. We use too much sedation, not enough multimodal analgesia and patients don’t mobilise enough. (Sorry for the rant, it’s the reason I’m subjecting myself to a life of research!) The patients from the Adrenal trial were sicker (a potential risk factor) and all on mechanical ventilation (which also may be a risk factor), but 30% of patients from Chest could have been enrolled in Adrenal.


No problem. I’m just walking through the normal steps that I would if someone approached me with this problem to begin a statistical consultation.

If I may ask, what sort of PhD program are you in? Do you have access to any statistical support or guidance, or is it your advisor’s expectation that you simply should be able to do all of this yourself? There are some great folks posting here and you will get good advice, but it’s even better if you can build a good collaborative relationship with a local statistician who has time and ability to meet with you regularly or semiregularly.

You are in a wonderful position by having access to data from two major RCTs for your dissertation; even if you’re not leveraging the randomization for your study, the data quality is likely far better than what many PhD students get to work with. I too benefited from writing a dissertation principally using secondary analysis of a major RCT.

I think it’s okay to treat the RCT data as though it came from a cohort study, but I suspect that at the very least you should check for any interactions with treatment assignment in each study before proceeding comfortably with that assumption. Would love to hear what others think.
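
A minimal sketch of the kind of interaction check I have in mind, for one binary risk factor within one trial: estimate the risk factor's odds ratio separately in each randomized arm and test whether the two log odds ratios differ. With no other covariates this should correspond to the Wald test for the product term in a logistic model containing the risk factor, the arm, and their product. The function names and counts are hypothetical, and a real analysis would more likely fit the adjusted model directly in statistical software.

```python
import math


def log_odds_ratio(a, b, c, d):
    """Log odds ratio and its Woolf standard error for one 2x2 table.
    a: exposed with pain,   b: exposed without pain,
    c: unexposed with pain, d: unexposed without pain."""
    log_or = math.log((a * d) / (b * c))
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    return log_or, se


def arm_interaction_test(table_control_arm, table_treated_arm):
    """Ratio of odds ratios (treated arm vs control arm), z statistic,
    and two-sided normal-approximation p-value."""
    lor_control, se_control = log_odds_ratio(*table_control_arm)
    lor_treated, se_treated = log_odds_ratio(*table_treated_arm)
    difference = lor_treated - lor_control
    se_difference = math.sqrt(se_control ** 2 + se_treated ** 2)
    z = difference / se_difference
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return math.exp(difference), z, p_value


# ASSUMPTION: invented counts, only to show the mechanics.
control_arm = (40, 160, 30, 170)
treated_arm = (44, 156, 31, 169)

ratio, z, p = arm_interaction_test(control_arm, treated_arm)
print(f"ratio of ORs={ratio:.2f}  z={z:.2f}  p={p:.2f}")
```

A non-significant test does not prove the absence of interaction, so I would read the ratio of odds ratios and its uncertainty rather than lean on the p-value alone.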

Thanks for the detail. I think this will be important context about the generalizability of your results. I would love to hear others’ opinions (both statisticians’ and clinicians’ opinions) about pooling these 2 trials for your purpose.

Will try to comment more if I have time later, but I’d love to see some other folks get involved.

I’m in a research-based PhD program with the University of New South Wales. My dissertation will be by thesis (not publication). My supervisor has no expectation about how I go about this; it is more for my own education, as I want to do as much as possible to understand how to do things correctly. To help with this, I am part way through a Masters of Medical Statistics (I haven’t made it to regression analyses yet, hence my post). I have some statisticians that have been useful and am trying to find one that I can build a long-lasting collaboration with. The variation in, and rationale for, the different approaches fascinates me.

That makes sense.

Thank you so much for your time delving into this. It’s greatly appreciated. I’d also love to hear what others think.

Is there anyone else who would like to add to Andrew’s comments? It’d be greatly appreciated.