As an ecologist, I’m no stranger to complex datasets. But when it comes to mixed-effects modelling in R using glmmTMB, things can get hairy. I’m currently stuck on how to model my data without introducing pseudoreplication, and I’m hoping someone out there can offer some insights.
## The Problem
I have an outcome variable called 'Score', which is generated by the BirdNET algorithm at 9 different thresholds. I want to explore the association between 'Score' and several predictors, including 'avgAMP', 'L10AMP', and 'Richness'. The catch is that these predictors take a single value per site-year combination, so that value is repeated on every threshold row for that site-year. That means simply fitting a model like this:
```r
Precision_mod <- glmmTMB(
  Score ~ avgAMP + Richness * Thrsh + (1 | Site),
  family = "ordbeta",
  na.action = "na.fail",
  REML = FALSE,
  data = BirdNET_combined
)
```
would bias the model by introducing pseudoreplication.
## The Data
My dataset is in long format, with 110 sites across 3 years (2021, 2022, 2023). Each site-year combination has one value for 'Richness', 'avgAMP', and 'L10AMP', while 'Score' takes a different value at each threshold.
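
To make the structure concrete, here is a toy version of the layout. It is only a sketch: every value is simulated, the threshold values are placeholders, the `Year` column name is my assumption, and it assumes every site was sampled in all three years, which may not hold for the real data.

```r
# Toy layout only: all values are simulated and the thresholds are placeholders.
set.seed(1)
sites <- paste0("S", 1:110)            # 110 hypothetical site labels
years <- c(2021, 2022, 2023)
thrsh <- seq(0.1, 0.9, by = 0.1)       # 9 hypothetical threshold values

# One row per site-year: the predictors live at this level.
site_year <- expand.grid(Site = sites, Year = years, stringsAsFactors = FALSE)
site_year$Richness <- rpois(nrow(site_year), lambda = 10)
site_year$avgAMP   <- rnorm(nrow(site_year))
site_year$L10AMP   <- rnorm(nrow(site_year))

# Long format: each site-year row is repeated once per threshold (cross join).
toy <- merge(site_year, data.frame(Thrsh = thrsh), by = NULL)
toy$Score <- runif(nrow(toy))          # placeholder outcome between 0 and 1

# Count distinct Richness values within each site-year; each count should be 1.
distinct_per_site_year <- aggregate(
  Richness ~ Site + Year,
  data = toy,
  FUN = function(x) length(unique(x))
)

nrow(site_year)  # number of site-years
nrow(toy)        # number of rows in long format
```

In this toy version the predictors only carry information at the site-year level, even though the long table has nine times as many rows. That mismatch is what I am worried about.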
## The Question
So, how do I model this data without introducing pseudoreplication? Any insights or advice would be greatly appreciated.
As a humble ecologist, I thank you for your time and support!