@f2harrell listed the two papers I would have mentioned, i.e. the Bakal & Armstrong paper and Pocock's win ratio. Just yesterday I reviewed a paper on the topic; I believe it's open review, so my comments will eventually appear somewhere. I have certain opinions about it. I don't like the Bakal & Armstrong approach; I find it ironic that while we are expressing concern over reproducibility, we watch researchers nominate ad hoc, arbitrarily constructed endpoints as the primary outcome of their study.
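For anyone unfamiliar with the win ratio, here is a minimal sketch of the unmatched version: every treated patient is compared with every control patient hierarchically (death first, then hospitalization), and the win ratio is wins divided by losses. The patient representation and field names are my own invention for illustration, and censoring is ignored for simplicity:

```python
from itertools import product

def compare_pair(t, c):
    """Hierarchical comparison: tier 1 is death, tier 2 is hospitalization.
    Times of None mean the event never occurred (treated as infinity).
    Returns +1 if t 'wins', -1 if t 'loses', 0 for a tie on both tiers."""
    # Tier 1: dying later (or not at all) is a win
    td = t["death_time"] if t["death_time"] is not None else float("inf")
    cd = c["death_time"] if c["death_time"] is not None else float("inf")
    if td != cd:
        return 1 if td > cd else -1
    # Tier 2: hospitalization later (or never) is a win
    th = t["hosp_time"] if t["hosp_time"] is not None else float("inf")
    ch = c["hosp_time"] if c["hosp_time"] is not None else float("inf")
    if th != ch:
        return 1 if th > ch else -1
    return 0  # tie on both tiers

def win_ratio(treated, control):
    """Unmatched win ratio: all treated-vs-control pairs, wins / losses."""
    wins = losses = 0
    for t, c in product(treated, control):
        outcome = compare_pair(t, c)
        if outcome > 0:
            wins += 1
        elif outcome < 0:
            losses += 1
        # ties contribute to neither count
    return wins / losses

# Toy data (made up): two patients per arm
treated = [{"death_time": None, "hosp_time": 10},
           {"death_time": 3, "hosp_time": None}]
control = [{"death_time": 5, "hosp_time": 2},
           {"death_time": None, "hosp_time": None}]
print(win_ratio(treated, control))  # 1 win, 3 losses
```

Note how the hierarchy means a death comparison, when decisive, pre-empts the softer outcome entirely, which is exactly why the relative incidence of the components matters so much.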
One issue is that the relative importance weights are often inversely related to incidence, e.g. death. Thus you lean on the weakest outcomes (which makes sense if, like Felker & Maisel, you are trying to make phase II results look more like phase III results). But power can then become inadvertently sensitive to study duration, which is dictated by extraneous factors (we showed this graphically here: Influence).
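A back-of-the-envelope way to see the inverse weight/incidence problem: a component's expected contribution to a weighted composite score is roughly weight × incidence, so a rare but heavily weighted outcome like death can still be swamped by a common "soft" component. The weights and event rates below are made-up numbers purely for illustration:

```python
# Hypothetical weighted composite: importance weight and expected events
# per 100 patients for each component (all numbers invented for illustration)
components = {
    "death":            (3.0,  5),   # highest weight, lowest incidence
    "hospitalization":  (2.0, 15),
    "biomarker change": (1.0, 60),   # lowest weight, highest incidence
}

for name, (weight, n_events) in components.items():
    # Expected contribution to the composite score per 100 patients
    print(f"{name:>16}: weight x incidence = {weight * n_events:.0f}")
```

Here the softest component contributes four times as much to the composite as death does, despite carrying a third of the weight, so the statistic (and the trial's power) rides largely on the soft outcome. And because event rates accrue at different speeds, these contributions shift as follow-up lengthens, which is the duration sensitivity mentioned above.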
An interim reassessment of power is surely prudent because composites have so many moving parts. But that presents another problem: someone mentioned this NEJM paper on Twitter where they did an interim analysis: nejm paper. My contention would be that you are not testing the same hypothesis at the interim analysis and at study completion, because the influence of the component outcomes varies across the study period.
Re “joint frailty models that account for weight and repeat events”, we did something along those lines (Brown & Ezekowitz, among @f2harrell's list), but weights then become unnecessary because you are not amalgamating outcomes; that's one of the advantages I see. I can't see multivariate modelling failing to overtake composites eventually; complexity usually wins in the end, but simplicity is so compelling.