Multiplicity in AIM-NIVO trial, uncontrolled by statistical discipline

I’ve come across an NCI-sponsored trial featuring abundant multiplicity, uncontrolled by any statistical or other formal discipline.


As recognized by a 2018 Nobel Prize, the immune checkpoint inhibitors (ICIs) represent a marvelous advance in the treatment of cancers. Unsurprisingly for drugs that ‘release the brakes’ on immunity, the characteristic adverse effects of ICIs are autoimmune in nature. These immune-related adverse events (irAEs), although generally less common (and perhaps less severe) than the adverse effects of traditional cytotoxic chemotherapy, do present special concerns for people with cancer who have pre-existing autoimmune disease. Accordingly, ICI trials have typically excluded patients with autoimmune conditions, leaving oncologists with little evidence to guide the use of these agents in such patients. An ongoing multi-center phase 1b trial (“AIM-NIVO”), conducted by the NCI’s Experimental Therapeutics Clinical Trial Network (ETCTN), is investigating the use of the ICI nivolumab in the setting of autoimmune diseases at varying levels of severity.


AIM-NIVO is designed to enroll as many as 26 separate cohorts defined by ordinal [mild/moderate/severe] severity levels within each of 9 autoimmune conditions. (The ‘missing’ 27th cohort is due to ‘moderate’ and ‘severe’ rheumatoid arthritis being collapsed into 1 level.) Within each autoimmune condition, working upward from lower to higher severity, a cohort of up to 12 patients at each severity level will be enrolled. Enrollment will stop early within a cohort if observed rates of dose-limiting toxicity (DLT) exceed specified thresholds. (As pointed out in this PubPeer post from May, the cohort-wise operating characteristics tabulated in the trial protocol are likely wrong. In the present post, however, I’m dealing entirely with the ‘orthogonal’ matter of multiplicity across the cohorts of this trial.)
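To give a rough sense of scale for the multiplicity at issue, here is a minimal sketch in Python. The cohort count and cohort size come from the design described above. The true DLT rate and the stopping count are hypothetical placeholders of my own, not the protocol's thresholds, and the calculation assumes every cohort enrolls fully and that cohorts behave independently.

```python
from math import comb

def binom_tail(n, p, k):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, x) * p**x * (1 - p) ** (n - x) for x in range(k, n + 1))

# From the design as described: 9 conditions x 3 severity levels,
# minus 1 because moderate and severe rheumatoid arthritis are collapsed.
N_CONDITIONS = 9
N_LEVELS = 3
N_COHORTS = N_CONDITIONS * N_LEVELS - 1
COHORT_SIZE = 12

# HYPOTHETICAL values for illustration only; NOT taken from the protocol.
TRUE_DLT_RATE = 0.20  # assumed common true DLT rate in every cohort
STOP_AT = 5           # assumed rule: flag a cohort once 5 of 12 patients have a DLT

# Chance that a single cohort trips the (hypothetical) rule.
per_cohort = binom_tail(COHORT_SIZE, TRUE_DLT_RATE, STOP_AT)

# Chance that at least one cohort trips it, assuming independent cohorts.
any_cohort = 1 - (1 - per_cohort) ** N_COHORTS

print(f"cohorts: {N_COHORTS}")
print(f"P(single cohort flagged): {per_cohort:.3f}")
print(f"P(at least one cohort flagged): {any_cohort:.3f}")
```

Whatever the actual thresholds turn out to be, the second probability grows quickly with the number of cohorts, which is the sense in which multiplicity across cohorts matters here.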

Operational Management

A touted feature of this trial is its organizational structure, incorporating “teams of experts”:

To the statistical eye, this diagram invites questions about how best to share information across the autoimmune conditions in this trial. As planned, all such information sharing appears to be entirely managerial, such that the study “will be very closely monitored throughout its conduct with regular safety calls” but without formal statistical supports. This is despite the fact that the investigators advance a hypothesis which potentially inverts the presumptive relationship between disease severity and irAEs, thereby inviting formal consideration of confounding factors such as concomitant immunosuppressive therapy:

Lessons or Examples from Other Trials?

Datamethods’ denizens seem oriented more toward Phase III and IV (post-marketing) trials. Might those trials hold any lessons for this Phase 1b trial? Can anyone point to an exemplary late-phase trial prospectively designed for information sharing across heterogeneous arms, especially in the presence of a hypothesis like #7 above? Are there useful analogies with the treatment of center effects in multi-center trials, or with study heterogeneity in meta-analysis?
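To make concrete what I mean by formal information sharing, here is a minimal sketch of partial pooling across cohorts, in the spirit of a random-effects meta-analysis. Everything in it is an assumption for illustration: the function name, the DLT counts, and the prior strength are hypothetical, and the simple beta-binomial shrinkage shown is just one possible approach, not anything proposed in the trial protocol.

```python
def partial_pool(dlts, enrolled, prior_strength):
    """Shrink each cohort's observed DLT rate toward the pooled rate.

    Equivalent to a Beta prior centered on the pooled rate and carrying
    the weight of `prior_strength` pseudo-patients (a hypothetical tuning
    parameter; larger values mean more borrowing across cohorts).
    """
    pooled = sum(dlts) / sum(enrolled)
    pseudo_events = pooled * prior_strength
    return [
        (x + pseudo_events) / (n + prior_strength)
        for x, n in zip(dlts, enrolled)
    ]

# HYPOTHETICAL counts for three cohorts; not data from AIM-NIVO.
dlts = [0, 1, 3]
enrolled = [12, 12, 6]

raw = [x / n for x, n in zip(dlts, enrolled)]
shrunk = partial_pool(dlts, enrolled, prior_strength=6.0)

for r, s in zip(raw, shrunk):
    print(f"raw {r:.3f} -> shrunk {s:.3f}")
```

A fuller treatment would estimate the degree of borrowing from the data and would need covariates such as severity level and concomitant immunosuppression, which is exactly where I'd welcome pointers.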