Hi,
I’m a PhD candidate in cell biology who (in retrospect) has never received extensive statistical training in how to best analyse the most common experimental design in my field. Most of what I am asking here is inspired by self-study of literature by Lazic, Harrell, Gelman, Lakens, and McElreath, but I feel the need to clarify some questions. I hope I am in an appropriate place here.
Background
Cell biology experiments are very often influenced by variation at multiple levels.
But we commonly analyse (or at least I was taught to analyse) these experiments using t-tests or similar methods at only one of these levels.
As an example, a typical experiment for us uses neurons from mouse brains (level 1) that we grow in separate multi-well dishes (level 2), where we can apply a separate treatment to each well. The readouts of these experiments are often quantified for individual cells (level 3). Usually the variability between cells (level 3) is much larger than the variability between animals (or culture dishes).
- For a long time, common practice was to estimate treatment effects only at level 3 (the level with the largest variability but also the most observations) while ignoring the variability introduced by the other levels.
- Since this analysis, however, violates the independence assumption (several level-3 cells sit in the same well of a multi-well plate when the treatment is applied), a “safe” suggestion has been to average over cells and treat the individual mouse brain (level 1) as the experimental unit (e.g. as proposed in “What exactly is ‘N’ in cell culture and animal experiments?”).
Neither approach, however, specifically estimates the variance of the treatment effect. Analysing at level 3 (cells) merges cell-to-cell variability with treatment variability; analysing at level 1 (animals) ignores cell-to-cell variability and is strongly influenced by animal-to-animal variability.
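To make my mental picture explicit (this is just my own sketch; the indexing and distributional assumptions may well be naive), I imagine the per-cell readout roughly as

$$ y_{ijk} = \beta_0 + \beta_1 \,\text{treatment}_{jk} + u_k + v_{jk} + \varepsilon_{ijk}, $$

with $u_k \sim N(0, \sigma^2_{\text{animal}})$ for animal $k$ (level 1), $v_{jk} \sim N(0, \sigma^2_{\text{dish}})$ for dish/well $j$ within animal $k$ (level 2), and $\varepsilon_{ijk} \sim N(0, \sigma^2_{\text{cell}})$ for cell $i$ (level 3). Because the treatment is applied per well, it varies at level 2 rather than at the level of individual cells.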
My questions
I assume treatment effects and treatment variability to be the most relevant parameters for most research questions. But is this assumption fair?
- Is it, on the contrary, more interesting (or more robust) to always estimate the treatment variance together with the variability of the cells or of the experimental system the treatment is applied to, in order to strengthen conclusions? I am also thinking about replicability in further experiments here, which might be influenced by between-experiment variability as well as by treatment variability.
- If treatment variability is indeed most relevant for robust quantification of scientific experiments, shouldn’t we focus on methods that model the other relevant sources of variability to improve the precision of our treatment estimate? How can we do that without pseudoreplicating (see the following block)?
Furthermore, regarding methodologies to accurately estimate the various variances:
- If, in such an experimental design, the per-cell readout were modelled with a mixed GLM with fixed factor ‘treatment’ and random factors ‘culture dish’ and ‘mouse brain’, would this accurately isolate the treatment effect, or would it still pseudoreplicate the experimental unit (the culture dish)? (See the sketch after this list for what I have in mind.)
- Alternatively, would it be possible to include cell-to-cell variability as a random factor (each cell usually gives only one readout), or is that not required?
- Or would it be best, after all, to summarize the readouts to the experimental unit and address animal-to-animal variability by normalization (e.g. by subtracting the mean readout per animal)?
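To make the mixed-GLM question concrete, here is a minimal sketch of what I have in mind (Python/statsmodels; the column names, simulated effect sizes, and the nesting of ‘dish’ within ‘animal’ via `vc_formula` are my own assumptions, not a recommendation):

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical long-format data: one row per cell, with columns
# 'readout', 'treatment', 'animal', 'dish' (all names are placeholders).
rng = np.random.default_rng(0)
n_animals, dishes_per_animal, cells_per_dish = 5, 4, 30
rows = []
for a in range(n_animals):
    animal_eff = rng.normal(0, 1.0)          # animal-to-animal variability (level 1)
    for d in range(dishes_per_animal):
        dish_eff = rng.normal(0, 0.5)        # dish/well-to-well variability (level 2)
        treat = d % 2                        # treatment is assigned per well
        for c in range(cells_per_dish):
            cell_noise = rng.normal(0, 2.0)  # cell-to-cell variability (level 3, largest)
            rows.append(dict(
                readout=10 + 1.5 * treat + animal_eff + dish_eff + cell_noise,
                treatment=treat, animal=f"a{a}", dish=f"a{a}_d{d}"))
df = pd.DataFrame(rows)

# Random intercept per animal, plus a variance component for dishes
# nested within animals (dish labels are unique per animal).
model = smf.mixedlm("readout ~ treatment", df, groups="animal",
                    vc_formula={"dish": "0 + C(dish)"})
fit = model.fit()
print(fit.summary())
```

My hope would be that the ‘treatment’ coefficient from such a fit is the estimate of interest and that the animal and dish variance components keep the per-cell rows from pseudoreplicating the experimental unit, but that is exactly what I am unsure about.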
I hope you can clarify some of my questions or point me to explanations of these topics. Cell-biology-related references in particular would be very useful to me, as I am constantly searching for material to forward to fellow researchers with similar questions.
Best, Joachim
PS: If you actually read this far, thank you for your interest! This somehow turned out much longer than I intended… partial answers would also help me a lot.