BBR Session 1: Biostatistics, Measurements, Optimum Y

Without having experience with that type of experiment, I’ll just say that the best approach in general is to have variables/parameters in a grand model that capture things like strain of microorganism.

In the video, we talked about how converting a continuous variable into categories is a bad thing to do. But sometimes you may want to do that, for example if you are using a linear model and you believe that the relationship between the dependent variable and the covariate is non-linear. Another example: if you have state as a covariate and you are using a random forest, you cannot plug state directly into the model because it has too many categories. How do we deal with this sort of discrepancy?

You could relax the linearity assumption and use something like spline regression. Your choice of knots would then help characterize the “categories” of interest. As for having too many categories, I believe the general rule for knot selection is 3 or 4 knots, but this depends on the data.
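
To make the spline suggestion a bit more concrete, here is a minimal sketch in Python with NumPy. Everything in it is an assumption for illustration only: the simulated sine-shaped data, the cubic truncated power basis, the three knots placed at the quartiles of x, and the helper names `cubic_spline_basis` and `fit_rss`. It fits the same data three ways (a straight line, four categories cut at the knots, and a cubic regression spline) and compares the residual sums of squares.

```python
import numpy as np


def cubic_spline_basis(x, knots):
    """Design matrix for a cubic regression spline (truncated power basis).

    Columns: 1, x, x^2, x^3, then max(x - k, 0)^3 for each knot k.
    """
    x = np.asarray(x, dtype=float)
    columns = [np.ones_like(x), x, x**2, x**3]
    for k in knots:
        columns.append(np.clip(x - k, 0.0, None) ** 3)
    return np.column_stack(columns)


def fit_rss(design, y):
    """Ordinary least squares fit; returns coefficients and residual sum of squares."""
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ coef
    return coef, float(resid @ resid)


# Simulated data (hypothetical): a clearly non-linear relationship plus noise.
rng = np.random.default_rng(0)
x = np.sort(rng.uniform(0.0, 2.0 * np.pi, 200))
y = np.sin(x) + rng.normal(0.0, 0.2, x.size)

# Three knots at the quartiles of x (an illustrative choice, not a rule).
knots = np.quantile(x, [0.25, 0.5, 0.75])

# 1) Straight-line fit.
linear_design = np.column_stack([np.ones_like(x), x])

# 2) Categorized fit: cut x into four bins at the knots, one mean per bin.
bin_index = np.digitize(x, knots)      # integer labels 0, 1, 2, 3
binned_design = np.eye(4)[bin_index]   # one indicator column per bin

# 3) Cubic regression spline with the same knots.
spline_design = cubic_spline_basis(x, knots)

_, rss_linear = fit_rss(linear_design, y)
_, rss_binned = fit_rss(binned_design, y)
_, rss_spline = fit_rss(spline_design, y)

print(f"RSS linear: {rss_linear:.2f}")
print(f"RSS binned: {rss_binned:.2f}")
print(f"RSS spline: {rss_spline:.2f}")
```

The categorized fit forces a flat step within each bin, while the spline uses the same cut points but stays smooth across them, which is why the knots play a role similar to category boundaries without throwing away information. In practice you would likely reach for an existing spline implementation rather than building the basis by hand.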

During the first lecture, someone requested some texts on study design.

Here are a few that I have used and found useful:

I would love to hear other people’s suggestions as well.