Hopefully this is a fairly simple question for all of you, but my familiarity with multilevel models is a bit lacking.
I’ve been asked to work with a dataset that has an outcome measured on individual animals, some information on each animal and the year, several variables describing the size and type of farm, and an individual identifier for each farm. Many of these farms have contributed several animals to the dataset, and larger farms generally contribute more observations. The objective is inferential modeling to estimate the temporal trend in the data.
Initially, I had planned to cluster on the farm identifiers and include animal information, year, and both farm size and farm type as fixed effects in the model. However, I’m concerned that this would not be appropriate, because the farm size and farm type variables are on the same level as the clustering.
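To make that first plan concrete, here is roughly the model I have in mind, written as a random-intercept model. The notation is my own shorthand, and I'm assuming a continuous outcome and a single indicator for farm type purely for illustration (the real outcome may need a link function, and farm type may need several indicators):

```latex
y_{ij} = \beta_0 + \beta_1\,\mathrm{year}_{ij}
       + \boldsymbol{\beta}_2^{\top}\mathbf{a}_{ij}
       + \beta_3\,\mathrm{size}_j + \beta_4\,\mathrm{type}_j
       + u_j + \varepsilon_{ij},
\qquad u_j \sim N(0, \sigma_u^2),
\quad \varepsilon_{ij} \sim N(0, \sigma^2)
```

Here i indexes animals, j indexes farms, and a_ij stands for the animal-level covariates. Because size_j and type_j carry only the farm index, they are constant within each farm, which is what I mean by "on the same level as the clustering."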
Is a more reasonable approach to use a multilevel model that clusters first on farm size and farm type, then on individual farms within those groups? Or would that be no better than clustering on individual farm without including farm size and type in the model?
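As a sketch of that nested alternative, again in my own shorthand and again assuming a continuous outcome: if k indexes a size-by-type group (which would require binning farm size into categories if it is recorded as a continuous variable), the model would look something like this:

```latex
y_{ijk} = \beta_0 + \beta_1\,\mathrm{year}_{ijk}
        + \boldsymbol{\beta}_2^{\top}\mathbf{a}_{ijk}
        + v_k + u_{jk} + \varepsilon_{ijk},
\qquad v_k \sim N(0, \sigma_v^2),
\quad u_{jk} \sim N(0, \sigma_u^2),
\quad \varepsilon_{ijk} \sim N(0, \sigma^2)
```

In this version, farm size and type enter only through the group-level random intercept v_k rather than as fixed effects, and u_jk is the intercept for farm j within group k.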
Disclaimer: My program doesn’t discuss multilevel modeling much and sticks to frequentist statistics, so I’m still working on teaching myself these models and Bayesian statistics.