Excursus: Computing ICCs for inter-rater agreement with linear mixed models
Random intercept models for estimating inter-rater reliability
Load packages
library(tidyverse)
library(psych)
library(lme4)
Load the example dataset
We use a dataset from Shrout and Fleiss (1979), which is included in the psych package as the example on the ?ICC help page. See that help page for more information about the dataset.
# example from Shrout and Fleiss (1979)
sf <- matrix(c(
  9, 2, 5, 8,
  6, 1, 3, 2,
  8, 4, 6, 8,
  7, 1, 2, 6,
  10, 5, 6, 9,
  6, 2, 4, 7), ncol = 4, byrow = TRUE)
colnames(sf) <- paste("J", 1:4, sep = "")
rownames(sf) <- paste("S", 1:6, sep = "")
# convert the matrix sf to a data.frame
sf <- as.data.frame(sf)
sf
J1 J2 J3 J4
S1 9 2 5 8
S2 6 1 3 2
S3 8 4 6 8
S4 7 1 2 6
S5 10 5 6 9
S6 6 2 4 7
Computing ICCs with the psych package
ICC(sf)
Call: ICC(x = sf)
Intraclass correlation coefficients
type ICC F df1 df2 p lower bound upper bound
Single_raters_absolute ICC1 0.17 1.8 5 18 0.16477 -0.133 0.72
Single_random_raters ICC2 0.29 11.0 5 15 0.00013 0.019 0.76
Single_fixed_raters ICC3 0.71 11.0 5 15 0.00013 0.342 0.95
Average_raters_absolute ICC1k 0.44 1.8 5 18 0.16477 -0.884 0.91
Average_random_raters ICC2k 0.62 11.0 5 15 0.00013 0.071 0.93
Average_fixed_raters ICC3k 0.91 11.0 5 15 0.00013 0.676 0.99
Number of subjects = 6 Number of Judges = 4
See the help file for a discussion of the other 4 McGraw and Wong estimates,
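The printed table can also be accessed programmatically, which helps when comparing it with the mixed-model estimates. The following is a minimal sketch, assuming that the object returned by ICC() stores the table in its results element; the names icc_out and icc_tab are chosen here for illustration only.

```r
library(psych)

# Shrout and Fleiss (1979) ratings, rebuilt so that this block runs on its own
sf <- matrix(c(
  9, 2, 5, 8,
  6, 1, 3, 2,
  8, 4, 6, 8,
  7, 1, 2, 6,
  10, 5, 6, 9,
  6, 2, 4, 7), ncol = 4, byrow = TRUE,
  dimnames = list(paste0("S", 1:6), paste0("J", 1:4)))

icc_out <- ICC(sf)

# Data frame with one row per ICC type (assumed to live in $results)
icc_tab <- icc_out$results

# Keep only the ICC type and the point estimate
icc_tab[, c("type", "ICC")]
```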
Review of the ICCs from the bachelor-level Diagnostik II lecture
This information is taken from the ?ICC help page:
ICC1: Each target is rated by a different judge and the judges are selected at random.
ICC2: A random sample of k judges rate each target. The measure is one of absolute agreement in the ratings.
ICC3: A fixed set of k judges rate each target. There is no generalization to a larger population of judges.
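In terms of variance components, these coefficients can be sketched as follows. The notation is an assumption introduced here for illustration: \sigma^2_s is the variance between subjects, \sigma^2_r the variance between raters, \sigma^2_e the residual variance of a model with subject and rater effects, \sigma^2_w the within-subject variance of a model with subject effects only, and k the number of raters.

```latex
\begin{aligned}
\mathrm{ICC}(1,1) &= \frac{\sigma^2_s}{\sigma^2_s + \sigma^2_w}, &
\mathrm{ICC}(1,k) &= \frac{\sigma^2_s}{\sigma^2_s + \sigma^2_w / k}, \\[4pt]
\mathrm{ICC}(2,1) &= \frac{\sigma^2_s}{\sigma^2_s + \sigma^2_r + \sigma^2_e}, &
\mathrm{ICC}(2,k) &= \frac{\sigma^2_s}{\sigma^2_s + (\sigma^2_r + \sigma^2_e) / k}, \\[4pt]
\mathrm{ICC}(3,1) &= \frac{\sigma^2_s}{\sigma^2_s + \sigma^2_e}, &
\mathrm{ICC}(3,k) &= \frac{\sigma^2_s}{\sigma^2_s + \sigma^2_e / k}.
\end{aligned}
```

The single-rater versions describe the reliability of one rating; the k versions describe the reliability of the mean across k raters, which is why the error terms are divided by k.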
Estimating ICCs directly with lme4
# Convert the data to long format with the tidyverse
sf_long <- sf %>%
  rownames_to_column(var = "subject") %>%
  pivot_longer(cols = J1:J4, names_to = "rater", values_to = "rating")
sf_long
# A tibble: 24 × 3
subject rater rating
<chr> <chr> <dbl>
1 S1 J1 9
2 S1 J2 2
3 S1 J3 5
4 S1 J4 8
5 S2 J1 6
6 S2 J2 1
7 S2 J3 3
8 S2 J4 2
9 S3 J1 8
10 S3 J2 4
# ℹ 14 more rows
Model with random intercepts for subjects (one-way random effects ANOVA)
mod1_sf <- lmer(rating ~ 1 + (1|subject), data = sf_long)
summary(mod1_sf)
Linear mixed model fit by REML ['lmerMod']
Formula: rating ~ 1 + (1 | subject)
Data: sf_long
REML criterion at convergence: 113.6
Scaled residuals:
Min 1Q Median 3Q Max
-1.48624 -0.77485 -0.01922 0.86835 1.49054
Random effects:
Groups Name Variance Std.Dev.
subject (Intercept) 1.244 1.116
Residual 6.264 2.503
Number of obs: 24, groups: subject, 6
Fixed effects:
Estimate Std. Error t value
(Intercept) 5.2917 0.6844 7.732
Computing the ICCs from the estimates in the output:
# ICC(1,1)
1.244 / (1.244 + 6.264)
[1] 0.1656899
# ICC(1,k)
1.244 / (1.244 + 6.264 / 4)
[1] 0.4427046
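Copying rounded variances from the printed summary is error-prone, so the same calculation can be done on the fitted model object. The following is a minimal sketch, assuming the variance components are extracted with as.data.frame(VarCorr(...)); the data are rebuilt with base R so that the block runs on its own, and the names vc, var_subject, var_residual, icc_1_1 and icc_1_k are chosen here for illustration.

```r
library(lme4)

# Shrout and Fleiss (1979) ratings in long format (6 subjects x 4 raters)
sf_long <- data.frame(
  subject = rep(paste0("S", 1:6), each = 4),
  rater   = rep(paste0("J", 1:4), times = 6),
  rating  = c(9, 2, 5, 8,
              6, 1, 3, 2,
              8, 4, 6, 8,
              7, 1, 2, 6,
              10, 5, 6, 9,
              6, 2, 4, 7)
)

# One-way model: random intercepts for subjects only
mod1_sf <- lmer(rating ~ 1 + (1 | subject), data = sf_long)

# One row per variance component; the vcov column holds the variances
vc <- as.data.frame(VarCorr(mod1_sf))
var_subject  <- vc$vcov[vc$grp == "subject"]
var_residual <- vc$vcov[vc$grp == "Residual"]

# Number of raters per subject
k <- length(unique(sf_long$rater))

icc_1_1 <- var_subject / (var_subject + var_residual)      # ICC(1,1)
icc_1_k <- var_subject / (var_subject + var_residual / k)  # ICC(1,k)

round(c(ICC1 = icc_1_1, ICC1k = icc_1_k), 2)
```

Up to rounding, the two values should agree with the ICC1 and ICC1k rows of the psych output.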
Model with random intercepts for subjects and raters (two-way random effects ANOVA)
mod2_sf <- lmer(rating ~ 1 + (1|subject) + (1|rater), data = sf_long)
summary(mod2_sf)
Linear mixed model fit by REML ['lmerMod']
Formula: rating ~ 1 + (1 | subject) + (1 | rater)
Data: sf_long
REML criterion at convergence: 91.3
Scaled residuals:
Min 1Q Median 3Q Max
-2.5153 -0.3782 0.3378 0.5360 0.8607
Random effects:
Groups Name Variance Std.Dev.
subject (Intercept) 2.556 1.599
rater (Intercept) 5.244 2.290
Residual 1.019 1.010
Number of obs: 24, groups: subject, 6; rater, 4
Fixed effects:
Estimate Std. Error t value
(Intercept) 5.292 1.334 3.967
Computing the ICCs from the estimates in the output:
# ICC(2,1)
2.556 / (2.556 + 5.244 + 1.019)
[1] 0.2898288
# ICC(2,k)
2.556 / (2.556 + (5.244 + 1.019) / 4)
[1] 0.6201249
# ICC(3,1)
2.556 / (2.556 + 1.019)
[1] 0.714965
# ICC(3,k)
2.556 / (2.556 + 1.019 / 4)
[1] 0.9093658
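The four coefficients can also be computed from the fitted two-way model without retyping the variances. This is a minimal sketch under the same assumptions as the hand calculation: all coefficients, including ICC(3,1) and ICC(3,k), are derived from the model with random subject and rater intercepts. The helper names (vc, var_subject, var_rater, var_residual, iccs) are illustrative and not part of lme4.

```r
library(lme4)

# Shrout and Fleiss (1979) ratings in long format (6 subjects x 4 raters)
sf_long <- data.frame(
  subject = rep(paste0("S", 1:6), each = 4),
  rater   = rep(paste0("J", 1:4), times = 6),
  rating  = c(9, 2, 5, 8,
              6, 1, 3, 2,
              8, 4, 6, 8,
              7, 1, 2, 6,
              10, 5, 6, 9,
              6, 2, 4, 7)
)

# Two-way model: crossed random intercepts for subjects and raters
mod2_sf <- lmer(rating ~ 1 + (1 | subject) + (1 | rater), data = sf_long)

# Extract the three variance components by group name
vc <- as.data.frame(VarCorr(mod2_sf))
var_subject  <- vc$vcov[vc$grp == "subject"]
var_rater    <- vc$vcov[vc$grp == "rater"]
var_residual <- vc$vcov[vc$grp == "Residual"]

# Number of raters
k <- length(unique(sf_long$rater))

iccs <- c(
  ICC2  = var_subject / (var_subject + var_rater + var_residual),
  ICC2k = var_subject / (var_subject + (var_rater + var_residual) / k),
  ICC3  = var_subject / (var_subject + var_residual),
  ICC3k = var_subject / (var_subject + var_residual / k)
)

round(iccs, 2)
```

Up to rounding, these should agree with the ICC2, ICC2k, ICC3 and ICC3k rows of the psych output.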