Determining the accuracy of wearable monitoring devices

R_cubed · November 11, 2024, 12:21pm

Using standard deviations from the sample mean to detect outliers is problematic because the mean is the summary statistic most sensitive to any single data point, so an extreme value shifts the very reference point it is being judged against.

(Section 4.4.2 describes the mathematical properties of the mean that make it unsuitable as a measure of location for detecting abnormal values.)

Finding highly influential observations (i.e., “abnormal” values) requires either a procedure like the jackknife, to see how much the mean changes when an observation is removed, or a summary statistic that is less responsive to abnormal values, such as the median.
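To make the two options concrete, here is a rough Python sketch (standard library only). The readings are made-up heart-rate values, the function names are my own, and the cutoff of 3 and the 1.4826 scaling of the median absolute deviation (MAD) are conventional choices I am assuming, not anything prescribed in this thread. It is not one of O’Gorman’s adaptive procedures, just an illustration of leave-one-out influence on the mean versus a median-based rule.

```python
import statistics

# Hypothetical wearable heart-rate readings (bpm); the last one is suspect.
readings = [72, 75, 71, 74, 73, 76, 72, 180]

def jackknife_mean_shifts(values):
    """For each observation, the change in the mean when it is left out."""
    n = len(values)
    total = sum(values)
    full_mean = total / n
    return [(total - v) / (n - 1) - full_mean for v in values]

def sd_flags(values, k=3.0):
    """Flag values more than k sample SDs from the sample mean."""
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)
    return [abs(v - mean) > k * sd for v in values]

def mad_flags(values, k=3.0):
    """Flag values more than k scaled MADs from the median.

    1.4826 makes the MAD comparable to the SD under normality.
    If the MAD is zero (many tied values), every value that differs
    from the median is flagged, so treat that case with care.
    """
    med = statistics.median(values)
    mad = statistics.median([abs(v - med) for v in values])
    scale = 1.4826 * mad
    return [abs(v - med) > k * scale for v in values]

shifts = jackknife_mean_shifts(readings)
most_influential = max(range(len(readings)), key=lambda i: abs(shifts[i]))

print("leave-one-out shifts in the mean:", [round(s, 2) for s in shifts])
print("most influential reading:", readings[most_influential])
print("flagged by mean/SD rule: ", sd_flags(readings))
print("flagged by median/MAD rule:", mad_flags(readings))
```

With these numbers the mean/SD rule flags nothing, because the reading of 180 inflates both the mean and the SD it is compared against, while the median/MAD rule flags only the 180 and the jackknife shows it moving the mean far more than any other point.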

An old thread discusses the issue of dealing with outliers. I particularly like the intro sections of Thomas O’Gorman’s books on adaptive procedures, which discuss how to think about the problem from a principled frequentist perspective. He has also provided comparative simulation results for various textbook procedures that demonstrate their benefits and drawbacks relative to the adaptive procedures he derived.