Where are the exceptional responders?

In September 2016, the FDA granted accelerated approval to the exon-skipping drug eteplirsen for the subset of Duchenne muscular dystrophy (DMD) patients with a mutation amenable to exon 51 skipping. The basis for this approval was that initial studies had “demonstrated an increase in dystrophin production that is reasonably likely to predict clinical benefit in some patients.” (Although this increase was on the order of only ~1% of normal levels, it is nevertheless comparable to the quantities of dystrophin found in patients with the considerably milder Becker form of muscular dystrophy [1]. So the demonstrated levels of dystrophin production would on their face seem capable of converting DMD phenotypically to BMD.)

In the nearly 8 years since approval, however, the required confirmatory clinical trial has still not been completed [2]. The sponsor has instead produced “real-world evidence” of benefit using claims and EHR data [3]. To date, trials of eteplirsen have enrolled several hundred participants.

Even making allowances for yet-undiscovered factors that might be undermining the hoped-for efficacy of eteplirsen in many or even most trial participants (thereby delaying acquisition of a traditional ‘confirmatory’ statistical signal), would it not be reasonable at this point to expect that at least a few exceptional responses [4] should have been observed and reported?

Alfano et al [5] report on identical-twin boys who lost ambulation about 36 weeks after randomization to eteplirsen, but then kept pace with the remaining 10 study participants on upper-extremity, cardiac and ventilation measures. This offers a hint at the sort of thing I might have hoped to see by way of a case report. What would be persuasive to me as an ‘exceptional response’ would be a participant who demonstrated loss of ambulation at a relatively early age, either before initiating eteplirsen or shortly afterward [presumably, before the drug had exerted its full effect on dystrophin production], but then exhibited a distinctly BMD-like clinical course that left no doubt in an experienced clinician’s mind that the very character of the patient’s disease had been altered. (The clinician would say e.g., “I have never seen a boy with DMD lose ambulation at such an early age, then go on to preserve upper-extremity functioning for this long.”)

I’m asking for just 2 such case reports. Am I being unreasonable?

  1. De Feraudy Y, Ben Yaou R, Wahbi K, et al. Very Low Residual Dystrophin Quantity Is Associated with Milder Dystrophinopathy. Annals of Neurology. 2021;89(2):280-292. doi:10.1002/ana.25951

  2. Bendicksen L, Kesselheim AS, Rome BN. Spending on Targeted Therapies for Duchenne Muscular Dystrophy. JAMA. 2024;331(13):1151. doi:10.1001/jama.2024.2776

  3. Iff J, Zhong Y, Tuttle E, Gupta D, Paul X, Henricson E. Real-world evidence of eteplirsen treatment effects in patients with Duchenne muscular dystrophy in the USA. J Comp Eff Res. 2023;12(9):e230086. doi:10.57264/cer-2023-0086

  4. Marx V. Cancer: A most exceptional response. Nature. 2015;520(7547):389-393. doi:10.1038/520389a

  5. Alfano LN, Charleston JS, Connolly AM, et al. Long-term treatment with eteplirsen in nonambulatory patients with Duchenne muscular dystrophy. Medicine. 2019;98(26):e15858. doi:10.1097/MD.0000000000015858


David - I don’t think you’re being unreasonable.

If a therapy is highly efficacious and causes net improvement in a patient’s clinical state, we can often detect that efficacy easily at the level of individual patients. The improvements are obvious to everybody: patient, primary care physician, and specialist. Examples include:

  • patients with severe asthma or inflammatory bowel disease who are finally able to get off prednisone and stay out of hospital after starting a biologic;
  • patients with very low ejection fractions whose EF and quality of life improves quickly and dramatically with modern goal-directed medical therapy for CHF;
  • patients with decades of severe psoriasis whose skin finally clears after starting biologic therapy;
  • patients with hemiplegia from a large vessel ischemic stroke who walk out of hospital a week after emergency thrombectomy…

In contrast, it’s a LOT harder to detect efficacy at the level of an individual patient when the therapy doesn’t improve the patient compared with his baseline clinical state, but rather works by slowing his disease progression. In fields where cure is often impossible (e.g., oncology and neurodegenerative diseases), a lot more expertise is usually needed to detect therapeutic efficacy at the level of individual patients.

For relentlessly progressive diseases where prognosis/time course is short and highly predictable/homogeneous (i.e., disease trajectories are very similar between patients), clinical anecdote can be very valuable/informative in gauging therapeutic efficacy. If ALL patients with a certain highly aggressive cancer tend to die within 3 months of diagnosis and if we then see a patient survive for 5 years after trying a new therapy, everyone knows they have witnessed something astonishing. In contrast, clinical anecdote is much LESS useful for gauging therapeutic efficacy for diseases with trajectories that are longer and more heterogeneous (e.g., Alzheimer’s disease, muscular dystrophy).

The problems with eteplirsen that you highlighted in your post overlap in many ways with therapies for other progressive diseases. Anyone who cares for patients with dementia can attest that it’s nearly impossible to tell whether existing therapies (primarily cholinesterase inhibitors) are doing anything at all in a given patient. The main problems are:

  1. We often don’t have granular “pre-treatment” documentation of a patient’s clinical trajectory. Given heterogeneity in disease trajectories between patients, we can’t detect a therapy-induced slowing of the rate of decline unless we have detailed and reliable information regarding the pre-treatment disease trajectory;
  2. Our commonly-applied clinical monitoring tools are crude at best. These are not the same tools as those used in clinical trials to gauge clinical progression. In the “real world,” assessment of disease progression (or not) from one clinic visit to the next often seems to hinge on clinician/family “gestalt” more than any solid/objective evidence;
  3. Crude clinical assessment tools, combined with modest (at best) therapeutic efficacy, are a recipe for ignorance about whether a treatment is helping the patient in front of us at all. And yet, once the patient starts medication, everyone is afraid to stop it in case they “make things worse” [translation: in case drug discontinuation converts an (undocumented) less acute trajectory to an (undocumented) more acute trajectory].

This is the reality of dementia care today. Now imagine if we were to ask a publicly funded healthcare system to fund a switch from cholinesterase inhibitors to very costly newer therapies like amyloid-directed antibodies. All of the above problems would still exist, but now they would be superimposed on a bankrupt healthcare system, further backlogged imaging infrastructure (given the need for serial MRIs), and a few side effect-related deaths for good measure. Is it any wonder why companies marketing these therapies are getting push-back from physicians?

The desperation of families with loved ones who have fatal progressive diseases is heartbreaking and completely understandable. But patients with these diseases deserve better than criminally overpriced therapies with nearly-undetectable efficacy. Until pharmaceutical companies are able to develop treatments with truly meaningful clinical efficacy, we should expect public healthcare spending to be directed preferentially toward supportive care that might improve the quality of life of both patients and caregivers.


Thank you so much for your thoughtful reply, Erin.

The heterogeneity of disease trajectories you describe seems to me the crucial source of the clinical-scientific problem here. Pediatric growth charts, however, provide one example of how developmental trajectories can be usefully compared. Just as a child with onset of growth restriction may be seen to ‘fall off their growth curve’, a boy with DMD who initiated a successful treatment might be seen to ‘escape’ upward from his (downward) disease trajectory.

But the challenge I’m putting to the drug companies here is far less stringent than the requirement to measure response in every individual patient. I’m asking if they can demonstrate a definite response in any individual patient.

Although I’d like to advance the conjecture that this principle may be generically applicable to the whole class of fatal degenerative diseases you are pointing toward, including Alzheimer disease, I haven’t yet been able to produce a satisfactory argument for that grand thesis. In the particular case of eteplirsen, however, the conjunction of the following two principles seems to provide ample reason to expect at least a few exceptional responders:

  • The fact that eteplirsen can induce dystrophin levels characteristic of the considerably milder BMD
  • Duan et al. (2021) cite “the importance of genetic modifiers, whereby variations in genes involved in, for example, inflammation or fibrosis formation, can influence disease outcome” in both BMD and DMD.

Although there might be a temptation to think of Accelerated Approval as setting a lower bar, perhaps we ought to think of it in the opposite way: it creates high expectations that are fully capable of being disappointed. With all that genetic modification thrown into the mix, how can we not by now have a few exceptional responders?

Conversely, if we have indeed not seen any exceptional responders among hundreds treated, this starts to look like proof that some robust phenomenon is at work dashing our hopes for the drug.


I’ve been working toward a formulation that crystallizes the intuition here, in something like the way economists tend to do with their ‘toy models’. Here’s what I have thus far:

Let’s scope the discussion specifically to degenerative disease, in which we have a concept of disease activity A(t) as a function of time, which is the time-derivative of a cumulative state. In DMD, one could think of this as the rate of fibrofatty replacement (FFR) of inflamed muscle tissue. In Alzheimer dementia, this could be [conceptually, at least — I acknowledge doubts about the amyloid cascade hypothesis] the rate of amyloid β deposition. Lysosomal storage diseases may lend themselves to a similar concept, etc.

I think it important that this picture could be elaborated substantively in terms of stochastic processes, with activity A_t modeled as (say) a mean-reverting Ornstein-Uhlenbeck process, and disease progression captured in its time-integral S_t = \int_0^t A_u \text{d}u . But I won’t pursue that here.

What I’d like to focus on is characterizing the putative effect of a therapy as reduction of disease activity. Certainly, this accords with the concept of steroid use in DMD, which targets the inflammation that [presumably] lies immediately upstream of the FFR process.

As a first approximation, let’s treat this reduction as multiplication by a factor 1-\theta for \theta \in [0,1]. (Theta for therapy.) Thus, \theta = 0 is a null therapy, while \theta =1 corresponds to a cure. In a heterogeneous degenerative disease — muscular dystrophy makes an excellent example here, with its many different mutations generating a wide spectrum of severity — we have to expect substantial heterogeneity of treatment effects (HTE). The population distribution of \theta then needs to be considered. For simplicity, I’ll posit that a randomly selected individual i has \theta_i with the 1-parameter distribution,

\text{P}(\theta_i < \theta) = \theta^\lambda.

Thus, for \lambda=1 we have \theta \sim \text{U}[0,1], while for \lambda \rightarrow \infty we get increasing concentration of mass near \theta \approx 1 (a cure), and for \lambda \rightarrow 0 we concentrate the mass near \theta \approx 0 (a marginal therapy).
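This one-parameter family is trivial to simulate; here is a minimal Python sketch (nothing beyond the stated CDF is assumed), using inverse-CDF sampling with quantile function q(p) = p^{1/\lambda}:

```python
import random

def sample_theta(lam, n, seed=0):
    """Draw n effect sizes theta_i with P(theta_i < t) = t**lam.

    Inverse-CDF sampling: since p = t**lam, the quantile function
    is q(p) = p**(1/lam).
    """
    rng = random.Random(seed)
    return [rng.random() ** (1.0 / lam) for _ in range(n)]

# lam = 1 recovers Uniform[0, 1]; large lam piles mass near 1 (cures),
# small lam piles mass near 0 (marginal effects). The exact mean of
# the density lam * t**(lam - 1) is lam / (lam + 1).
for lam in (0.2, 1.0, 5.0):
    draws = sample_theta(lam, 100_000, seed=42)
    print(f"lam = {lam}: sample mean = {sum(draws) / len(draws):.3f}, "
          f"exact mean = {lam / (lam + 1):.3f}")
```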

In a situation where a sponsor thrashes through a bunch of trials that never get reported :thinking:, outsiders can nevertheless draw some inferences about \lambda by using the lack of any case report of an exceptional response as a censored observation of \theta.

Suppose that any activity reduction exceeding \theta_c would be clinically evident (c for clinical or critical, say). Then every participant enrolled in one of these hidden trials and never highlighted in a favorable case report contributes a factor \theta_c^\lambda to the likelihood, and the likelihood for n such participants is (\theta_c^\lambda)^n = e^{n \ln\theta_c \cdot \lambda}.

Now a \text{Gamma}(\alpha,\beta) prior on \lambda is conjugate for this likelihood. That is, if we choose the prior

p(\lambda) = \frac{\beta^\alpha}{\Gamma(\alpha)} \lambda^{\alpha-1} e^{-\beta \lambda},

then our posterior is also Gamma-distributed:

\begin{aligned} p(\lambda \mid n) \propto p(\lambda)\cdot\theta_c^{n\lambda} & \propto \lambda^{\alpha-1}e^{-\beta \lambda}\cdot e^{-n \ln(1/\theta_c)\lambda} \\ & = \lambda^{\alpha-1} e^{-[\beta +n \ln(1/\theta_c)]\lambda} \end{aligned}

Indeed, we see \lambda \mid n \sim \text{Gamma}(\alpha,\beta_n) with \beta_n = \beta+n\ln(1/\theta_c).

Observe that \theta_c < 1 \implies \ln(1/\theta_c) > 0, so that we have here a \beta_n parameter increasing linearly with the number n of participants enrolled quietly in these never-reported trials. Moreover, the constant \ln(1/\theta_c) will be on the order of \frac{1}{2}, if e.g. a 60% reduction in disease activity would be clinically evident: \ln (1/0.6)\doteq 0.51.

Because the Gamma distribution’s \beta is an inverse scale parameter, \beta\rightarrow\infty concentrates the posterior mass toward \lambda \rightarrow 0 (marginality).
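In code, the whole conjugate update is a one-liner; a small Python sketch (the Gamma(\alpha=1, \beta=1) prior and n=200 below are purely illustrative choices, not anything argued above):

```python
import math

def posterior_beta(beta0, n, theta_c):
    """Gamma-conjugate update: each quietly enrolled, never-highlighted
    participant contributes theta_c**lam to the likelihood, which adds
    ln(1/theta_c) to the Gamma rate parameter."""
    return beta0 + n * math.log(1.0 / theta_c)

alpha, beta0 = 1.0, 1.0   # illustrative low-information prior
for theta_c in (0.6, 0.8):
    bn = posterior_beta(beta0, n=200, theta_c=theta_c)
    # Posterior mean of lam is alpha / beta_n; it shrinks toward 0
    # (marginality) linearly in the number of unreported participants.
    print(f"theta_c = {theta_c}: beta_200 = {bn:.1f}, "
          f"E[lam | n] = {alpha / bn:.4f}")
```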

Incidentally, this term ‘marginal’ seems appropriate here on the dual grounds of its colloquial meaning “limited in extent, significance or stature” and its more formal statistical meaning — that any efficacy will only ever be detected in a marginal analysis averaging over many patients, and never clinically in any given individual.


Hi David

Is the gist of your post that you’re trying to estimate how much stock we should put in a drug sponsor’s promise that there could someday be an “exceptional responder” to a new therapy, even if the sponsor hasn’t been able to demonstrate an exceptional response after treating “X” number of patients?

If so, then couldn’t we just invert the title of the following article from 1983, changing it from:

“If Nothing Goes Wrong, Is Everything All Right?”

https://jamanetwork.com/journals/jama/article-abstract/385438#google_vignette

to

“If Nothing Goes Right, Is Everything All Wrong?”


Intriguing article! (It’s available here BTW for anyone immoral enough to use SciHub.) My hot take would be that I’m actually dealing with a case of zero denominator — drawing inferences from the unreportedness of results. And indeed this corresponds to the ‘inversion’ you suggest.

But I can see this piece deserves to be read with some care.


In terms of uses for this model, certainly its prime target would be the sponsor’s rhetoric — and specifically the shifting-goalposts aspect. (The shifting mode of the Gamma distribution indeed looks like a moving goalpost.) Initially, of course, there’s lots of hope that patients will benefit. But as time goes on, this morphs implicitly into the argument that we can’t be certain that patients aren’t benefiting to some [subclinical] degree. This kind of model may serve to mark approximately where that goalpost has moved at any point in time.

Beyond this, however, the \theta_c parameter in this model might help focus attention on improving clinical assessment.


Addendum: I’ve read the Hanley & Lippman-Hand article properly, and enjoyed being reminded of that hoary old rule of three. The psychologistic dimensions of their argument are also worth attending to. But in contrast to what I’m doing here, they confine themselves to inferences about a simple rate parameter. In the model above, I draw inferences about the whole distribution of therapeutic effect sizes. That is, rather than confining myself to estimating just the prevalence of effects above the \theta_c threshold, I’m trying to learn something about even clinically undetectable \theta_i \ll \theta_c effect sizes. Of course, such an effort rests on invoking some kind of power-law principle (see also Manfred Schroeder’s lovely Fractals, Chaos, Power Laws), which I’ve done implicitly in setting up my model. I conjecture that other reasonable model set-ups would effectively recapitulate the linear growth of \beta_n.


Let me follow up now with some back-of-the-envelope calculations.

Writing \theta_p for the p-quantile of \theta, we have

p = P(\theta < \theta_p) = \theta_p^\lambda,

from which we obtain

\frac{\ln(1/\theta_p)}{\ln(1/p)} = \frac{1}{\lambda}.

Now, since \lambda \sim \text{Gamma}(\alpha,\beta_n), we know that

\frac{\ln(1/\theta_p)}{\ln(1/p)} = \frac{1}{\lambda} \sim \text{Inv-Gamma}(\alpha,\beta_n).

Because the \beta parameter of \text{Inv-Gamma} is a scale parameter (rather than an inverse-scale, as with the \text{Gamma} distribution), we now have that \beta_n \rightarrow \infty shifts our distribution to the right. This drives \ln(1/\theta_p) \rightarrow \infty and consequently \theta_p \rightarrow 0.

Given our interest (as noted above) in watching the goalposts move, it makes sense to focus on the mode of \text{Inv-Gamma}(\alpha,\beta_n):

\begin{aligned} \text{mode}\left(\ln\frac{1}{\theta_p}\right) = \ln\frac{1}{p}\cdot\frac{\beta_n}{\alpha+1} & = \ln\frac{1}{p}\cdot\frac{\beta + n\ln\frac{1}{\theta_c}}{\alpha+1} \\ & > \ln\frac{1}{p}\cdot\frac{n\ln\frac{1}{\theta_c}}{\alpha+1} = \frac{n}{\alpha+1}\cdot\ln(1/p)\cdot\ln(1/\theta_c). \end{aligned}

Looking for some reasonable numbers to plug in here, consider first that a low-information prior will have small \alpha \sim \mathcal{O}(1). Accordingly, let’s suppose \alpha = 1. If we generously (to the sponsor) allow \theta_c = 0.8 (requiring an 80% reduction in disease activity to cross the threshold of clinical detectability), and choose p = 0.9 (so that we are asking about the therapeutic effect at the top decile of responses), then the n\approx 200 enrolled to date in eteplirsen trials (noted in the top post) yields:

\text{mode}\left(\ln\frac{1}{\theta_{0.9}}\right) > \frac{200}{1+1}\cdot\ln\frac{1}{0.9}\cdot\ln\frac{1}{0.8} = 2.35 \doteq \ln\frac{1}{0.095},

corresponding to \theta_{0.9} < 0.1 — a quite dismal bound on efficacy.
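For anyone wanting to check the arithmetic, a few lines of Python reproduce the bound (same \alpha=1, n=200, \theta_c=0.8, p=0.9 as above):

```python
import math

alpha, n, theta_c, p = 1.0, 200, 0.8, 0.9
# Mode of Inv-Gamma(alpha, beta_n) is beta_n / (alpha + 1); dropping the
# prior's beta > 0 gives the stated lower bound on the mode of ln(1/theta_p).
lower = (n / (alpha + 1)) * math.log(1 / p) * math.log(1 / theta_c)
print(f"lower bound on mode of ln(1/theta_0.9): {lower:.2f}")  # 2.35
print(f"corresponding theta_0.9: {math.exp(-lower):.3f}")      # 0.095
```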

Now it should be said that the mode (unlike the median) is not invariant under transformations. So this ‘correspondence’ doesn’t directly bound the modal \theta_{0.9} (on the \theta scale). Still, since the rightward skew of the \text{Inv-Gamma} distribution guarantees that

\text{median}\left(\ln\frac{1}{\theta_p}\right) > \text{mode}\left(\ln\frac{1}{\theta_p}\right),

and since the median (unlike the mode) is invariant under monotone transformations, we can at least state that the median \theta_{0.9} < 0.1.

Thus, we conclude there’s a below-50% (Bayesian) chance the top decile of responses does better than a 10% reduction in disease activity — a strong suggestion that this is a truly marginal drug.
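As a Monte Carlo cross-check on this median claim (again with an illustrative Gamma(\alpha=1, \beta=1) prior; Python stdlib only, no SciPy):

```python
import math
import random
import statistics

alpha, beta0, n, theta_c, p = 1.0, 1.0, 200, 0.8, 0.9
beta_n = beta0 + n * math.log(1 / theta_c)

rng = random.Random(7)
# lam | data ~ Gamma(alpha, rate = beta_n); gammavariate takes (shape, scale).
lams = [rng.gammavariate(alpha, 1.0 / beta_n) for _ in range(200_000)]
# Medians are invariant under monotone maps, so this is the posterior
# median of ln(1/theta_0.9) = ln(1/p) / lam.
med_log = statistics.median(math.log(1 / p) / lam for lam in lams)
print(f"posterior median of ln(1/theta_0.9): {med_log:.2f}")
print(f"posterior median theta_0.9: {math.exp(-med_log):.4f}")
```

With these numbers the sampled median of \ln(1/\theta_{0.9}) lands well above the mode's lower bound of 2.35, consistent with the rightward skew of the Inv-Gamma distribution.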

Does anybody see a flaw in the argument, or a mistake in my math?

While typing up a blog post on this issue, I discovered that the first link in the top post has disappeared from the FDA website. :detective: For the record, here it is on the Wayback Machine. Maybe this is the safest way to cite any FDA URL in future! :bomb:


It occurred to me yesterday that I might have gained at least some formal advantage by proceeding directly to a Bayesian expectation:

\begin{aligned} \text{E}[\text{P}(\theta_i < \theta \mid n)] &= \int_0^\infty \theta^\lambda\,p(\lambda\mid n) \,\text{d}\lambda \\ \\ &= \int_0^\infty \theta^\lambda \frac{\beta_n\,^\alpha}{\Gamma(\alpha)} \lambda^{\alpha-1} e^{-\beta_n\lambda}\,\text{d}\lambda \\ \\ &= \left(\frac{\beta_n}{\beta_n-\ln\theta}\right)^\alpha \int_0^\infty \frac{(\beta_n-\ln\theta)^\alpha}{\Gamma(\alpha)} \lambda^{\alpha-1} e^{-(\beta_n-\ln\theta)\lambda}\,\text{d}\lambda \\ \\ &= \left(\frac{\beta_n}{\beta_n-\ln\theta}\right)^\alpha \int_0^\infty \text{d}\,\text{Gamma}(\alpha, \beta_n-\ln\theta) \\ \\ &= \left(\frac{\beta_n}{\beta_n-\ln\theta}\right)^\alpha. \end{aligned}

Now this would seem to let us solve directly for \theta_p, if only we can validly equate

p = \text{E}[\text{P}(\theta_i < \theta_p \mid n)]. \tag{$\star$}

This is because, taking logs on both sides of

p = \left(\frac{\beta_n}{\beta_n-\ln\theta_p}\right)^\alpha = \left(1 - \frac{\ln\theta_p}{\beta_n}\right)^{-\alpha},

we would obtain

-\frac{\ln p}{\alpha} = \ln\left(1 - \frac{\ln\theta_p}{\beta_n}\right) < -\frac{\ln\theta_p}{\beta_n},

from which we get

\ln\theta_p < \frac{\beta_n}{\alpha}\,\ln p \implies \theta_p < p^{\beta_n/\alpha} = p^{(\beta+n\ln(1/\theta_c))/\alpha} < p^{(n/\alpha)\ln(1/\theta_c)}.

Substituting the same \alpha = 1, p=0.9 and \theta_c=0.8 as above, we would obtain

\theta_{0.9} < 0.9^{200\ln(1/0.8)} \doteq 0.9^{44.6} \doteq 0.009.
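Taking Eq (\star) at face value for the moment, the closed form is easy to verify numerically, and \theta_p can even be solved for exactly rather than merely bounded (\alpha=1, \beta=1 as in the earlier calculations; Python stdlib only):

```python
import math
import random

alpha, beta0, n, theta_c, p = 1.0, 1.0, 200, 0.8, 0.9
beta_n = beta0 + n * math.log(1 / theta_c)

def expected_cdf(theta):
    """Closed form E[P(theta_i < theta | n)] = (beta_n/(beta_n - ln theta))**alpha."""
    return (beta_n / (beta_n - math.log(theta))) ** alpha

# Monte Carlo cross-check of the Gamma-MGF integral at theta = 0.5.
rng = random.Random(11)
lams = [rng.gammavariate(alpha, 1.0 / beta_n) for _ in range(100_000)]
mc = sum(0.5 ** lam for lam in lams) / len(lams)
print(f"closed form: {expected_cdf(0.5):.4f}  Monte Carlo: {mc:.4f}")

# Solving p = (1 - ln(theta_p)/beta_n)**(-alpha) exactly for theta_p:
theta_p = math.exp(beta_n * (1.0 - p ** (-1.0 / alpha)))
print(f"theta_0.9 under (star): {theta_p:.4f}")
```

As expected, the exact solution comes out somewhat tighter than the log-inequality bound.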

BUT: I have serious doubts about Eq (\star), which seems to conflate two distinct kinds of distribution:

  1. The distribution of therapeutic effect \theta_i as i ranges over individuals in the population
  2. The Bayesian’s subjective probability over a family of these distributions, indexed by \lambda.

(Indeed, I’m reminded of the quote I’d previously referenced here.)

What do the Bayesians think?
