I wish to model blood metal concentrations as an outcome. The complication is that they were measured with a limit of detection, i.e. the lab equipment used is unable to determine concentrations below a certain threshold. Below that value all readings are the same (i.e. 0, or in my case the data was entered as the l.o.d. threshold value), and above the threshold the data are continuous.

I've read that hurdle models could be used to model this kind of data structure. As I plan to work in Stan via the brms R package, there are several hurdle models available: hurdle_poisson, hurdle_negbinomial, hurdle_gamma, hurdle_lognormal. The first two are for count data, so they are not appropriate for my data, but I understand that the gamma or lognormal options could be used for continuous data. However, I don't really know which of those is appropriate in which circumstances, or how to choose between them, and I was hoping some of the kind datamethods experts could advise? Or do others have experience of different approaches to modelling such data?
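For anyone who wants to play with this data structure without the real data, here is a minimal simulation sketch in plain Python. Everything in it is an assumption made for illustration: the lognormal shape of the underlying concentrations, the parameter values, the limit of detection of 0.3, and the function name simulate_lod_readings. None of it comes from NHANES. It only shows the two recording conventions mentioned above (below-threshold readings entered as the threshold value, or as 0).

```python
import random


def simulate_lod_readings(n, lod, mu=0.0, sigma=1.0, seed=1):
    """Simulate concentrations as recorded by equipment with a detection limit.

    Hypothetical helper: the lognormal distribution and its parameters are
    illustrative assumptions, not estimates from any real data set.
    """
    rng = random.Random(seed)
    # Underlying concentrations, which the lab cannot observe below the limit.
    true_values = [rng.lognormvariate(mu, sigma) for _ in range(n)]
    below_lod = [v < lod for v in true_values]
    # Convention 1: below-limit readings entered as the limit itself.
    recorded_as_lod = [lod if b else v for v, b in zip(true_values, below_lod)]
    # Convention 2: below-limit readings entered as 0.
    recorded_as_zero = [0.0 if b else v for v, b in zip(true_values, below_lod)]
    return recorded_as_lod, recorded_as_zero, below_lod


recorded_as_lod, recorded_as_zero, below_lod = simulate_lod_readings(n=5000, lod=0.3)
share_below = sum(below_lod) / len(below_lod)
print(f"share of readings below the limit of detection: {share_below:.1%}")
```

Under either convention the recorded variable has a spike at a single value plus a continuous part above the threshold. As far as I understand, the brms hurdle families treat the spike as exact zeros, so data entered as the threshold value would presumably need recoding to 0 first.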
To illustrate the issue, this plot shows a histogram of methyl mercury blood concentration from NHANES data, faceted by whether the reading was below the l.o.d. or a valid reading. About 13% of readings fall below the threshold: