This excellent paper by German et al. presents a computationally and statistically efficient way to jointly analyze the mean and the variance as a function of covariates. Their method is named *WiSER* for within-subject variance estimator by robust regression. The authors have two excellent applications: glucose variability in diabetic patients (with data collected 4x per year) and activity level variation (with data collected every 5 minutes). The need for efficient methods, especially with dense time series data per patient, is clear. WiSER applies to continuous response variables.

Most of the complexity of analyzing dense within-patient serial data comes from modeling the covariance structure of the multivariate data. Markov models do this by modeling conditional distributions. For example, a second-order Markov model conditions at each time t on the measurements made at times t-1 and t-2, and these previous measurements are treated no differently than baseline covariates. Once the analysis is complete, one uses the model's transition probability estimates to obtain unconditional estimates. The cost of unconditioning on previous times does not depend on the sample size, so this step is feasible for any number of patients. Because of the Markov model's conditional independence assumption, the likelihood calculations can ignore patient identifiers once the lagged measurements are computed within-patient. The calculations are therefore extremely fast: they are the same calculations for n patients each having m serial measurements as for n × m patients each having one measurement. In addition, Bayesian MCMC converges very quickly in these conditional independence models.
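To make the "lags as covariates" point concrete, here is a minimal simulated sketch (my own, not from the paper): a second-order Markov model fit as an ordinary least-squares regression on a long dataset in which the two lagged responses are just additional columns, and patient identity plays no further role. All variable names and the data-generating mechanism are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 200, 30                      # 200 patients, 30 serial measurements each
x = rng.normal(size=n)              # one baseline covariate per patient
y = np.zeros((n, m))
for t in range(1, m):               # first-order chain, lag coefficient 0.5
    y[:, t] = 0.5 * y[:, t - 1] + 0.3 * x + rng.normal(size=n)

# Stack into one row per (patient, time); the lagged measurements become
# ordinary covariates, so the fit is the same as for n*(m-2) independent rows.
Y  = y[:, 2:].ravel()
L1 = y[:, 1:-1].ravel()             # measurement at t-1
L2 = y[:, :-2].ravel()              # measurement at t-2
X  = np.column_stack([np.ones_like(Y), L1, L2, np.repeat(x, m - 2)])
beta, *_ = np.linalg.lstsq(X, Y, rcond=None)
print(beta)                         # lag-1 coefficient near 0.5, lag-2 near 0
```

Since the simulated chain is truly first-order, the second lag's coefficient is estimated near zero, illustrating how the order of the Markov model can be checked empirically.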

This leads to a question. Instead of having specialized models needing specialized software for jointly modeling the mean and the variability, why not use a Markov model to model variability separately? Wouldn't this also be a little more interpretable? Variability might be captured by modeling |Y(t) - Y(t-1)| conditional on baseline covariates, on |Y(t-1) - Y(t-2)|, and on Y(t-1), the last variable probably being unnecessary if Y is transformed so that successive differences in transformed Y are independent of the absolute level of Y. For example, Y(t-1) will be needed if Y should have been log-transformed but wasn't, i.e., if differences are relative rather than incremental. If Y(t-1) is not needed, we have a simple first-order Markov process in the 1-lag absolute differences.
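A toy version of this idea (simulated data, not from either paper) regresses the current absolute change on the previous absolute change plus a baseline covariate; in the simulation below, the covariate truly drives each patient's within-subject volatility, so its coefficient should come out positive:

```python
import numpy as np

rng = np.random.default_rng(1)
n, m = 300, 40
x = rng.normal(size=n)                    # baseline covariate
sigma = np.exp(0.4 * x)                   # within-patient volatility rises with x
# Random walks whose increments have patient-specific standard deviations:
y = np.cumsum(rng.normal(size=(n, m)) * sigma[:, None], axis=1)

d  = np.abs(np.diff(y, axis=1))           # |Y(t) - Y(t-1)|, shape (n, m-1)
D  = d[:, 1:].ravel()                     # current absolute change
D1 = d[:, :-1].ravel()                    # 1-lag absolute change
X  = np.column_stack([np.ones_like(D), D1, np.repeat(x, m - 2)])
beta, *_ = np.linalg.lstsq(X, D, rcond=None)
print(beta)                               # positive slope on x: x drives variability
```

Because the simulated walk's increments do not depend on the current level of Y, the level term Y(t-1) discussed above is omitted here; it would be needed if, say, the walk were multiplicative on the original scale.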

The Markov approach would solve another problem that was not satisfactorily dealt with in the German et al. article: in the accelerometer dataset, many 5-minute periods had patient activity levels of zero, so the distribution of Y has "clumping at zero". A semiparametric Markov model can handle this elegantly, because the zeros simply become the lowest category of an ordinal response, with no distributional assumption placed on Y. Semiparametric Markov longitudinal models are discussed in detail here.

I would be interested in others’ thoughts about the potential for the simpler Markov approach.