I’m currently struggling to specify the correct random effects structure for analyzing the data of a virtual reality experiment we conducted with a patient group and a healthy control group. I’ve read (and learned) a lot in the last few months, but with my psychology background I feel that I lack some of the background knowledge needed to apply this to my specific data structure. I would therefore be very grateful for any hint, feedback or literature recommendation.
A brief description of our paradigm:
The participants were asked to read out the same nine questions (asking for advice or support) in the same order to eight different virtual characters, one character after the other (all nine questions are put to the first virtual character, then to the second, and so on; 72 trials in total). The answers of the virtual characters differed in whether they expressed social acceptance or rejection and in whether the virtual character explained their response or not. The answers (= stimuli) can therefore be characterized by two factors: 1) reaction (rejection/acceptance) and 2) explanation (no/yes).
After listening to each answer, participants were asked to rate the avatar’s benevolence towards them by adjusting a slider on a scale (our dependent variable). The slider started in the middle of the scale at the beginning of each of the eight conversations and did not jump back between trials, but remained at the position set by the participant.
Among the virtual characters, the number of answers that were rejecting or accepting as well as with and without explanations was balanced. The frequencies, combined occurrences, and sequence of all four types of answers were also evenly distributed across all virtual characters. The response pattern assigned to each character remained consistent across participants, assigning a distinct ‘personality’ to each virtual character. The presentation order of the eight virtual characters was randomized.
In short, we are interested in whether and how the groups differ in their ratings depending on the experimental factors, so my fixed effects look like this: rating ~ group * reaction * explanation
Following the recommendations of Barr, I included a by-subject and a by-stimulus random intercept to account for the crossed repeated-measures structure (multiple observations per subject due to multiple stimuli, multiple observations per stimulus due to multiple subjects). From there I added by-subject random slopes for the experimental factors and their interaction, as well as a by-stimulus random slope for group, to account for pseudoreplication:
(1 + reaction * explanation | subject) + (1+group|stimuli)
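The logic behind this choice of slopes can be checked mechanically: a random slope for a factor is only meaningful for a grouping unit if that factor varies within the levels of that unit. The following is a minimal sketch in plain Python on a toy design, not our real data; the subject and stimulus labels and the helper `varies_within` are hypothetical names made up for illustration.

```python
from collections import defaultdict
from itertools import product

# Hypothetical toy design: 4 subjects (2 per group) crossed with 4 stimuli
# (one stimulus per reaction x explanation cell). Not the real data.
subjects = {"s1": "patient", "s2": "patient", "s3": "control", "s4": "control"}
stimuli = {
    "a1": ("acceptance", "yes"),
    "a2": ("acceptance", "no"),
    "a3": ("rejection", "yes"),
    "a4": ("rejection", "no"),
}

# Fully crossed: every subject rates every stimulus.
rows = [
    {"subject": s, "group": g, "stimulus": st, "reaction": r, "explanation": e}
    for (s, g), (st, (r, e)) in product(subjects.items(), stimuli.items())
]


def varies_within(rows, unit, factor):
    """Return True if `factor` takes more than one value inside
    at least one level of the grouping column `unit`."""
    seen = defaultdict(set)
    for row in rows:
        seen[row[unit]].add(row[factor])
    return any(len(values) > 1 for values in seen.values())


checks = {
    (unit, factor): varies_within(rows, unit, factor)
    for unit in ("subject", "stimulus")
    for factor in ("group", "reaction", "explanation")
}

for (unit, factor), ok in checks.items():
    print(f"{factor} varies within {unit}: {ok}")
```

In this toy design, reaction and explanation vary within subject but group does not, whereas group varies within stimulus but reaction and explanation do not. That matches the structure above: slopes for reaction and explanation by subject, and a slope for group by stimulus.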
Besides general feedback and suggestions, I have two unresolved questions:
- I later reduced the by-subject random slopes of this model until it converged, but I was hesitant to omit the group slope: without the by-stimulus random slope for group, the degrees of freedom for the interaction terms that include group increased suspiciously to >4440, which is close to our number of data points. I therefore assumed that the dependencies in our data were not correctly accounted for, and especially after reading Arnqvist (2020) and Scandola & Tidoni I feel that it would not be a good idea to omit the group random slope. Colleagues, however, argued that this might not be a problem, since one advantage of mixed models is that they analyze all data points instead of mean values, and the random slope for group seemed wrong to them. Does anyone have an idea or suggestion where I can learn more about dfs in mixed models and whether exploding dfs are suspicious or fine?
- Our virtual characters each had their specific answer pattern, the slider did not jump back to the middle after each rating, and the specific nature of the nine questions might also influence the ratings. We therefore wanted to add the factors ‘character’ and ‘question’ to the random effects structure. I thought about nesting the stimuli in the characters and adding another slope for the questions, since the questions were the same for all characters, but the answers (= stimuli) differed depending on the character. However, each answer (= stimulus) is already tied to a unique combination of character and question, so the information seems to be redundant, which might explain the singular fits. Does anyone have an idea or suggestion where I can learn more about how to properly account for design factors that might interfere with the experimental factors?
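To make the suspected redundancy in the second question concrete, here is a minimal sketch in plain Python that counts the levels of each candidate grouping factor. The counts of 8 characters and 9 questions come from the design described above; the stimulus labels and the helper `n_levels` are hypothetical and only serve the illustration.

```python
from itertools import product

N_CHARACTERS = 8  # from the design described above
N_QUESTIONS = 9

# Hypothetical labels: exactly one answer (= stimulus) per character-question pair.
trials = [
    {"character": c, "question": q, "stimulus": f"c{c}_q{q}"}
    for c, q in product(range(1, N_CHARACTERS + 1), range(1, N_QUESTIONS + 1))
]


def n_levels(rows, *columns):
    """Count the distinct combinations of the given columns."""
    return len({tuple(row[col] for col in columns) for row in rows})


levels = {
    "character": n_levels(trials, "character"),
    "question": n_levels(trials, "question"),
    "stimulus": n_levels(trials, "stimulus"),
    "character:question": n_levels(trials, "character", "question"),
    "character:stimulus": n_levels(trials, "character", "stimulus"),
}

for name, count in levels.items():
    print(f"{name}: {count} levels")
```

Under these assumptions, the character-by-question combination and stimuli nested in characters both have exactly as many levels as stimulus itself (72), so they define the same grouping as the by-stimulus term. Character (8 levels) and question (9 levels) on their own are coarser groupings with few levels each.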
Thank you very much in advance, especially for reading this rather long post!