If we have seasonality in data (e.g. a daily measurement that we expect to depend on outside temperature, amount of sunshine, pollen in the air, etc.), we probably want to assume that the seasonal pattern is approximately the same from year to year, especially if we have not measured the actual driving factor (temperature etc.).

For simplicity, let’s say all my data come from the same location (so no issues with hemisphere or latitude, although any extra thoughts on that are of course also welcome). Then I can put day of the year into my regression model, e.g. through a spline or in a discretized version (e.g. a factor level for each week or month, i.e. a piecewise-constant function, which is presumably inefficient).
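To make the discretized option concrete, here is a minimal sketch of the piecewise-constant encoding, using NumPy only. The function name `period_dummies` and its parameters are hypothetical, chosen for illustration; it assumes day of the year is 1-based and splits the year into equal-width bins rather than calendar months.

```python
import numpy as np

def period_dummies(day, n_bins=12, period=365.25):
    """One-hot encode day of year into n_bins equal-width bins.

    Hypothetical helper for illustration: bins are equal slices of the
    period, not calendar months.
    """
    day = np.asarray(day, dtype=float)
    # Map 1-based day of year to a bin index in 0..n_bins-1.
    idx = np.floor(((day - 1.0) % period) / period * n_bins).astype(int)
    return np.eye(n_bins)[idx]

# Example: first day, a mid-year day, and the last day of a leap year.
D = period_dummies([1, 200, 366])
```

Each row of `D` has a single 1, so regressing on these columns without an intercept just estimates one mean per bin. The fit jumps at bin boundaries, which is one reason this encoding is likely to be less efficient than a smooth basis.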

I thought the most obvious thing might be to put $x_{cr} = \cos(2 \pi r \cdot \text{day} / 365.25)$ and $x_{sr} = \sin(2 \pi r \cdot \text{day} / 365.25)$ for $r = 1, \ldots, R$ into the model (i.e. a Fourier series basis, which for sufficiently large $R$ should approximate just about any periodic function pretty well).
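A minimal sketch of that Fourier basis, assuming plain NumPy and ordinary least squares. The helper `fourier_features` is a hypothetical name, and the synthetic data (true coefficients, noise level, sample size) are made up purely to show the mechanics.

```python
import numpy as np

def fourier_features(day, R, period=365.25):
    """Return an (n, 2R) matrix: cos(2*pi*r*day/period) for r=1..R,
    followed by sin(2*pi*r*day/period) for r=1..R."""
    day = np.asarray(day, dtype=float)
    r = np.arange(1, R + 1)
    angle = 2.0 * np.pi * np.outer(day, r) / period
    return np.hstack([np.cos(angle), np.sin(angle)])

# Synthetic example (all numbers are assumptions for illustration).
rng = np.random.default_rng(0)
day = rng.integers(1, 366, size=2000).astype(float)
y = (3.0
     + 2.0 * np.cos(2 * np.pi * day / 365.25)
     - 1.0 * np.sin(2 * np.pi * 2 * day / 365.25)
     + rng.normal(0.0, 0.1, size=day.size))

R = 3
# Design matrix: intercept, then R cosine columns, then R sine columns.
X = np.column_stack([np.ones_like(day), fourier_features(day, R)])
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
```

With this column order, `coef[0]` is the intercept, `coef[1:R+1]` are the cosine coefficients and `coef[R+1:]` the sine coefficients. The same columns could be added to any regression model alongside other covariates; `R` then acts as a smoothness parameter that could be chosen by cross-validation or an information criterion.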

Is that the standard way to do this, or am I overlooking something obvious here? Are there some simple and good alternatives? Is there anything that you’d expect to perform better at finite sample sizes, or to match the true underlying function better with fewer terms?