Troponin, its use and misuse to rule-out and rule-in heart attacks



Patients presenting to the emergency department (ED) with symptoms the attending physician thinks could indicate a myocardial infarction (MI) will have blood drawn for measurement of troponins, often repeated after some time. An elevated concentration (above the 99th percentile of a healthy population), together with a change between measurements and some other criteria, is used to diagnose an MI. The clinical issue is that the prevalence of MI in such patients is low (2 to 25%, depending on the health system model) and traditionally many hours have been needed to “rule out” a heart attack. With newer (“high-sensitivity”) troponin assays enabling measurement of low concentrations, and the development of rapid rule-out strategies based on low troponin and/or risk score thresholds, the field has advanced so that an MI appears to be able to be safely ruled out in a clinically meaningful proportion of patients after only a few hours (disclaimer - I’ve been involved in developing, implementing or testing many of these algorithms). But there are issues in moving away from the “threshold” approach towards using troponin concentrations as a continuous variable and thereby, hopefully, better assisting physician and patient decision making. These include:

  1. Typically 30 to 50% of patients have troponin concentrations below the limit of detection of the assay. This introduces a “natural” threshold for that assay. It also means that modelling troponin as a continuous variable is difficult. I’ve tried some approaches but can’t say I’m comfortable with them. What would be the best approach to including troponin in logistic regression or other models?

  2. It would be disastrous for a patient to be sent home from the ED on the basis of some algorithm when they are actually having an MI. For this reason algorithms typically target very high sensitivities (>99%) or NPVs (>99.5%). This, though, means thresholds are often determined on the basis of only one or two patients (given the cohort sizes). Intuitively, I think this means we don’t end up with optimal thresholds (i.e., those with the best specificity). While “thresholds” are an issue well discussed on this forum and many would suggest not to use them, I want to run the gauntlet and ask a question on the basis that they will be used. How can we best determine and validate optimal thresholds when there are so few events?

  3. The outcome event (MI) for these troponin-based algorithms is itself troponin-based. This suggests to me that they will be biased, and it makes evaluation of novel (non-troponin) assays more difficult. How can we deal with this?
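As a concrete illustration of the few-events problem in point 2, here is a small sketch (all cohort numbers hypothetical) of how wide an exact confidence interval for sensitivity is when a candidate threshold misses only a single MI:

```python
# Sketch: how wide is the uncertainty around a "99% sensitive" threshold
# when it rests on only one missed MI? (Hypothetical numbers.)
from scipy import stats

n_mi = 200          # assumed number of MI patients in the derivation cohort
missed = 1          # MI patients falling below the candidate threshold
detected = n_mi - missed

sens = detected / n_mi                                   # 0.995
# Exact (Clopper-Pearson) 95% CI for sensitivity
ci_low = stats.beta.ppf(0.025, detected, missed + 1)
ci_high = stats.beta.ppf(0.975, detected + 1, missed)
print(f"sensitivity {sens:.3f}, 95% CI ({ci_low:.3f}, {ci_high:.3f})")
```

Even with 200 MI patients, the lower confidence bound sits well below the 99% target, so a threshold tuned to hit >99% observed sensitivity may well fail to deliver it in new patients.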


This entire post is an excellent question, and I look forward to seeing some of the discussion that ensues.

I just want to make one comment re: the use of “thresholds” - even @f2harrell understands that one must at some point make decisions like “Send this patient home from the ED or keep them for observation” and that they must eventually be based on a “threshold” of sorts. What we frequently caution against is dichotomizing data unnecessarily early in the decision process. If one creates a statistical model that returns the probability a patient is having an MI given a particular Tn value (and possibly other clinical inputs if available), and the decision is made that patients with probability(MI)>X% will be retained for further testing while patients with probability(MI)<X% will be sent home - I think many of us would look favorably upon that approach. What we often see in practice is data being dichotomized on the front end of that process - i.e. breaking troponin into “quartiles” or something like that before the model is constructed - which introduces several problems.
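To make the “dichotomize at the end, not the beginning” point concrete, here is a toy simulation (made-up data and coefficients, sketched with scikit-learn) comparing a model that keeps the predictor continuous, with any threshold applied to the resulting probability, against one that bins the predictor into quartiles up front:

```python
# Toy illustration (simulated data, not real troponin values): binning a
# continuous predictor before modelling discards information, relative to
# modelling it continuously and thresholding the *probability* at the end.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)
n = 4000
tn = rng.lognormal(mean=2.0, sigma=1.0, size=n)     # skewed "troponin"
p = 1 / (1 + np.exp(-(-4 + 1.2 * np.log(tn))))      # smooth true risk
y = rng.binomial(1, p)

train, test = slice(0, 2000), slice(2000, None)

# Continuous (log-transformed) predictor
m_cont = LogisticRegression().fit(np.log(tn[train, None]), y[train])
auc_cont = roc_auc_score(y[test], m_cont.predict_proba(np.log(tn[test, None]))[:, 1])

# Quartile-binned predictor (one-hot indicators)
q = np.searchsorted(np.quantile(tn, [0.25, 0.5, 0.75]), tn)
X_bin = np.eye(4)[q]
m_bin = LogisticRegression().fit(X_bin[train], y[train])
auc_bin = roc_auc_score(y[test], m_bin.predict_proba(X_bin[test])[:, 1])

print(f"AUC continuous: {auc_cont:.3f}  AUC quartiles: {auc_bin:.3f}")
```

The quartile model can only produce four distinct risk levels, so its held-out discrimination is capped below that of the continuous model even when both are fit to the same data.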


I very much appreciate your starting this discussion.

When troponin is the outcome variable, a very nice way to model it is with ordinal regression, e.g., the proportional odds ordinal logistic model. There will be a clumping of tied values at the lower detection limit, then troponin is treated continuously for the remainder of the values. When troponin is a predictor variable, a simple approximate approach is to have an indicator variable for “above detection limit” and add to that in the model a regression spline in the cube root of troponin.

As @ADAlthousePhD stated, optimal decision making comes from not dichotomizing anything until the best risk estimate is obtained. So I advise against seeking thresholds on any lab continuous predictor.

This circularity has puzzled me for some time. The only way I can understand how to interpret the result is if the outcome being predicted is “future troponin”, where the predictors are measured at least a couple of days earlier. But it would be nice to find a large cohort study where a non-troponin gold standard diagnosis (ECG/CK-MB/clinical) was available in the dataset.

A Different Issue
In the troponin literature, I see major problems in how repeated troponin measurements are analyzed as predictors. See for example my commentary here. Previous research has ignored the typical finding that when a patient measurement is updated, the initial value does not have the same predictive importance as the updated value, rendering the change or fold-change ineffective as a predictor of an ultimate diagnosis or outcome. We need to settle this issue for repeated troponins by following the recommended analytical strategy outlined in the above commentary, using the serial troponins as multiple predictors (properly transformed or splined). I’ll bet that we would find that the prevailing wisdom about serial troponin changes is flawed.
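One way to see the statistical point: using the change (or fold-change) as the sole predictor silently forces the two serial values to have equal-and-opposite coefficients, whereas entering them separately lets the data decide. A toy sketch with simulated data (all numbers invented):

```python
# Toy sketch (simulated): a delta-only model constrains the two serial values
# to equal-and-opposite coefficients; entering them separately relaxes that.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(2)
n = 5000
tn0 = rng.lognormal(2.0, 1.0, n)
tn1 = tn0 * rng.lognormal(0.1, 0.3, n)       # second draw, correlated with first
# Simulate an outcome where the *latest* value matters far more than the first
lp = -4 + 0.1 * np.log(tn0) + 1.0 * np.log(tn1)
y = rng.binomial(1, 1 / (1 + np.exp(-lp)))

X_sep = np.column_stack([np.log(tn0), np.log(tn1)])
m_sep = LogisticRegression().fit(X_sep, y)

X_delta = (np.log(tn1) - np.log(tn0))[:, None]   # fold-change only
m_delta = LogisticRegression().fit(X_delta, y)

b0, b1 = m_sep.coef_[0]
print(f"separate coefficients: first={b0:.2f}, latest={b1:.2f}")
print(f"delta-only coefficient: {m_delta.coef_[0, 0]:.2f}")
```

In this simulation the latest value carries nearly all the predictive weight, which the delta-only model is structurally unable to express.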


Thanks Andrew, I couldn’t agree more. Having said that, the problem still remains once we create predictive models with prob(MI)>X%. This is because in any model the troponin concentration is very dominant. Other variables including patient demographics, history, nature of pain, electrocardiogram results all contribute to such models, but only fractionally compared to troponin (I’m currently playing with different model approaches to try and improve things, but I’ve still got issues).


Thanks Frank… I’d not thought previously to include a separate indicator variable for “above the detection limit”… I shall certainly give this a go!


Very interesting, thank you. Is there any specific reason to use the cube root transformation rather than the logarithm, given that troponin can never be zero, just below the detection limit?


As the troponin is not truly 0, why not simply impute it from a uniform distribution between 0 and the detection limit?


Thanks for raising this Frank. There are many issues related to repeat troponin measurements - in addition to the statistical issue you raise, there are also analytical issues related to the accuracy of the assays and the delta between repeated troponin measurements. I’m a physicist from way back, and my first ever physics lesson at university taught me to add the errors of two measurements when subtracting the measurements themselves. With some of the MI rule-out algorithms deriving and advocating deltas of just 2 or 3 ng/L, I think the measurement issues are profound.
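That error-propagation rule, sketched with an assumed analytical SD of 2 ng/L per draw (illustrative only; real imprecision is assay- and concentration-specific):

```python
# Error propagation for a delta between two troponin draws: independent
# measurement errors add in quadrature. The 2 ng/L analytical SD is an
# assumption for illustration.
import math

sd_single = 2.0                                     # assumed SD per draw, ng/L
sd_delta = math.sqrt(sd_single**2 + sd_single**2)   # ~2.83 ng/L

# 95% band for the delta under pure measurement noise
rcv = 1.96 * sd_delta
print(f"SD of delta: {sd_delta:.2f} ng/L; 95% band: +/-{rcv:.1f} ng/L")
```

Under these assumptions, an advocated delta of 2 or 3 ng/L sits well inside the band that measurement noise alone can produce.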


Thanks Jay, I do think this could be useful. However, I had some problems when I played with doing this once, when I was trying to fit a smooth line through sensitivity versus troponin concentration. The choice of distribution and the random placing of concentrations on that distribution made a large difference to the shape of the curve. This, though, could merely reflect that I needed many, many more observations.


Thanks for starting this discussion John - I think you’ve illustrated the issues well. I will catch up after ESC Congress has finished.


Log will work just as well as \sqrt[3]{x} in this case.


Multiple imputation can be a great way to deal with detection limits, but if you don’t have any real measurements available below the limit, it is impossible to impute without strong extrapolation assumptions, e.g., assuming a log-normal distribution for troponin. In other words, our more nonparametric imputation methods such as predictive mean matching are ruled out.
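For what such a parametric extrapolation looks like in practice, here is a sketch (simulated data; a single imputation draw for brevity, where proper multiple imputation would also draw the parameters) that fits a log-normal by censored maximum likelihood and imputes below-LOD values from the truncated lower tail:

```python
# Sketch: assume log-troponin is normal, fit mu/sigma by censored maximum
# likelihood (below-LOD values contribute only P(X < LOD) to the likelihood),
# then impute the censored values from the fitted distribution truncated
# above at the LOD. All data are simulated.
import numpy as np
from scipy import stats, optimize

rng = np.random.default_rng(3)
LOD = 2.0
x = rng.lognormal(mean=0.5, sigma=1.0, size=2000)
obs = x[x >= LOD]                     # measured values
n_cens = int((x < LOD).sum())         # only the count is known below the LOD

def neg_loglik(theta):
    mu, log_sigma = theta
    s = np.exp(log_sigma)
    ll = stats.lognorm.logpdf(obs, s, scale=np.exp(mu)).sum()
    ll += n_cens * stats.lognorm.logcdf(LOD, s, scale=np.exp(mu))
    return -ll

res = optimize.minimize(neg_loglik, [0.0, 0.0])
mu_hat, log_s_hat = res.x
s_hat = np.exp(log_s_hat)

# Draw imputations from the fitted distribution truncated above at the LOD
u = rng.uniform(0, stats.lognorm.cdf(LOD, s_hat, scale=np.exp(mu_hat)), size=n_cens)
imputed = stats.lognorm.ppf(u, s_hat, scale=np.exp(mu_hat))
print(f"fitted mu={mu_hat:.2f}, sigma={s_hat:.2f}; "
      f"imputed range {imputed.min():.3f}-{imputed.max():.3f}")
```

Everything below the LOD here is driven entirely by the log-normal assumption, which is exactly the extrapolation caveat above.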


Proposed Goal

I’d like to propose that we have the following research goal: To create a summary of continuous present (and possibly past) troponin levels that either represents

  1. the probability of a definite myocardial infarction, or
  2. a scale that represents the size/severity of infarction, with the lowest scale value signifying no evidence (from troponin) of any infarction


Troponin is not endogenous to the diagnosis of STEMI, and ~all patients who have STEMI get several troponin draws. That would be a reasonable instrument, if you were careful to model, e.g., different time to lab draw from arrival, etc.


I have always been troubled by the circularity of defining troponin thresholds using troponin itself to define MI to determine sensitivity and specificity of a threshold.
I have also been troubled by how close the 99th percentile is to the limit of detection. The assumption is that there is a distribution (normal?) that extends below the limit of detection.
There is an enormous amount of confusion in practice surrounding the interpretation and ordering of troponin. I wrote a paper on this with Harlan Krumholz and Sanjay Kaul that can be found here:
J Am Coll Cardiol. 2016 Nov 29;68(21):2365-2375. doi: 10.1016/j.jacc.2016.08.066.


I look forward to reading the paper in detail. At first look I see sensitivity and specificity used. Those quantities are at odds with decision making. So are upper limits of “normals”. To make optimum decisions you’d need to relate troponin levels to the probability of disease, not find the distribution of troponin in those without disease.


I like these ideas. 1 is something I work on. 2 is something I think of interest but would require multiple troponin measurements because of its time course. Previously for Acute Kidney Injury I tried to look at the area under the creatinine curve to describe severity, could something similar work here? Or is “peak” troponin enough?
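The area-under-the-curve idea is easy to sketch with the trapezoid rule (times and values below are invented):

```python
# Toy sketch of an "area under the curve" severity summary for serial draws,
# analogous to the creatinine AUC idea; all times and values are made up.
import numpy as np

hours = np.array([0.0, 2.0, 6.0, 12.0])      # time of each draw, h from arrival
tn = np.array([40.0, 180.0, 350.0, 220.0])   # hypothetical hs-troponin, ng/L

# Trapezoid rule by hand, giving units of ng*h/L
auc = float((((tn[1:] + tn[:-1]) / 2) * np.diff(hours)).sum())
peak = float(tn.max())
print(f"AUC: {auc:.0f} ng*h/L, peak: {peak:.0f} ng/L")
```

Note the AUC depends on the sampling times as well as the values, so uneven draw schedules across patients would need to be handled (e.g., normalizing by follow-up time) before it could serve as a comparable severity summary.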


A problem with using troponin alone is that analytical errors sometimes occur. The frequency and contributing patient factors are not entirely clear. Sometimes a high troponin concentration is not bad: it could be part of the clinical course (e.g., cardiac surgery) or due to an endogenous factor that does not correlate with heart injury (heterophile antibody, macrocomplex, etc.).


A proper analysis of serial troponins, not done to date as far as I can tell, would reveal the proper role for the peak, and whether, once you know the latest values, the earlier values are relevant in any way. If later values are all-important, then area under the troponin curve would not be an optimal summary.


A patient could present with an hsTnI (Abbott) of 38 ng/L, with serial measurements of 45, 36, and then 42. The lack of change (within the imprecision of the assay) suggests no acute injury, and another workup is initiated.
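That judgement can be made explicit by comparing each delta with a reference change value; the 10% analytical CV below is an assumed, illustrative figure, not the actual imprecision of any specific assay:

```python
# Sketch of the "no real change" judgement in the example above: compare each
# delta against a 95% reference change value built from an assumed analytical
# CV of 10% (illustrative; true CV is assay- and concentration-specific).
import math

draws = [38.0, 45.0, 36.0, 42.0]     # serial hs-TnI values from the example, ng/L
cv = 0.10                            # assumed analytical CV

for a, b in zip(draws, draws[1:]):
    sd_delta = math.sqrt((cv * a) ** 2 + (cv * b) ** 2)
    rcv = 1.96 * sd_delta            # 95% band for the delta under pure noise
    flag = "within noise" if abs(b - a) <= rcv else "possible real change"
    print(f"{a:.0f} -> {b:.0f}: delta {b - a:+.0f}, RCV +/-{rcv:.1f} ng/L ({flag})")
```

Under this assumed CV, every delta in the example falls inside the band explainable by assay imprecision alone, matching the clinical reading of “no acute injury”.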