Machine learning applications in epilepsy using high-dimensional imaging data

I am an engineer applying neuroimaging methods to answer clinical questions in epilepsy. There is a new review paper out, "Machine learning applications in epilepsy": https://onlinelibrary.wiley.com/doi/full/10.1111/epi.16333

I am particularly interested in section 4.1, as many of the studies include the typical 50 or so patients and 50 or so controls, with hundreds to thousands of imaging parameters per subject fed into a machine learning model to classify something like type of lesion (example, ref 52: https://n.neurology.org/content/86/7/643). Section 5 has examples with similar numbers of subjects and parameters used to predict treatment outcome (example, ref 79: “Deep learning applied to whole‐brain connectome to determine seizure control after epilepsy surgery”).

I have heard that with so many parameters you need many more subjects to use ML methods. Do these types of studies apply other methods that “get around” this issue? They are published in high-impact journals and are highly regarded in this field. With similar data and similar goals, I wonder whether this is an approach I should be considering.

Thanks for any thoughts on this.

Excellent question Vicky. I hope that others who have direct knowledge of machine learning in image analysis will respond; I have some general comments. I have yet to see a situation in medical research where “magic” happens and a machine learning technique suddenly works miracles with small sample sizes. Note that this is in stark contrast to what can be accomplished when visual pattern-recognition algorithms are developed with machine learning in extremely high signal:noise situations, such as recognizing letters of the alphabet or the head of a cat. Predicting treatment outcome is far, far more difficult than that, and because of patient-to-patient variability the signal:noise ratio is low. So we are usually in situations where one needs on the order of 200 events per candidate feature.
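To make that arithmetic concrete, here is a minimal back-of-envelope sketch in Python. The 50 events and 1,000 candidate features are illustrative numbers taken from the scenario Vicky describes, not from any particular paper:

```python
# Events-per-candidate-feature (EPV) arithmetic -- illustrative numbers only.
events = 50                 # e.g., ~50 patients in the smaller outcome group
candidate_features = 1000   # hundreds to thousands of imaging parameters per subject

epv_available = events / candidate_features
epv_needed = 200            # rule of thumb cited above for low signal:noise prediction

print(f"EPV available: {epv_available:.3f}")                      # 0.050
print(f"EPV needed:    {epv_needed}")                             # 200
print(f"Events needed for 1000 features: {epv_needed * candidate_features:,}")  # 200,000
```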

One needs to recognize this not as a classification problem but as a problem of estimating tendencies (probabilities). To put this in the context of 50 diseased cases and 50 controls: the minimum sample size for estimating the overall gross average incidence of disease, with no features whatsoever, is 96. It is impossible to do something complex if the sample size is not adequate for making the most trivial estimate (an overall marginal proportion). And n = 96 only achieves a margin of error of ±0.1 in estimating the underlying probability.
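For anyone who wants to see where the 96 comes from, here is a minimal sketch of the standard worst-case binomial sample-size calculation (0.95 confidence, margin of error 0.1, p = 0.5 as the worst case); nothing here is specific to the epilepsy papers:

```python
# Sample size for estimating a single proportion to within +/- d with 0.95 confidence:
# n = z^2 * p * (1 - p) / d^2, largest (worst case) at p = 0.5.
z = 1.959964   # 0.975 quantile of the standard normal distribution
p = 0.5        # worst case: binomial variance p*(1-p) is maximized at 0.5
d = 0.1        # desired margin of error (confidence-interval half-width)

n = z**2 * p * (1 - p) / d**2
print(round(n, 1))  # -> 96.0, the n = 96 figure quoted above
```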

I hope that others can comment on the particular papers you referenced. In the meantime, I share your implied skepticism and view all these published papers as contributing to the reproducibility crisis in science.
