The mixed-effects analyses shown here were done with the program R (version 2.4.1), with the function `lmer`

in package `lme4`

(Bates, 2005). It is assumed that you have some basic knowledge of R (or S or S-Plus). For further assistance, you may consult my tutorial on Statistics with R and S-Plus, or other help information at the R project website `www.r-project.org`.

In the log below, user input commands are indicated by blue and bolder type, like this.

Program output is indicated by black monospace type, like this.

Comments and annotations are given in italic sans-serif type, with indented left margin, like this.

Start the program R. The console window of R will show something like the following:

R version 2.4.1 (2006-12-18) Copyright (C) 2006 The R Foundation for Statistical Computing ISBN 3-900051-07-0 R is free software and comes with ABSOLUTELY NO WARRANTY. You are welcome to redistribute it under certain conditions. Type 'license()' or 'licence()' for distribution details. R is a collaborative project with many contributors. Type 'contributors()' for more information and 'citation()' on how to cite R or R packages in publications. Type 'demo()' for some demos, 'help()' for on-line help, or 'help.start()' for an HTML browser interface to help. Type 'q()' to quit R. Loading required package: utils [Previously saved workspace restored]

Online help for command or function

`xyz`

is available by typing help(xyz) at the R prompt, or by browsing the Help menu option. The help pages also contain highly informative examples that may be copy-pasted at the R prompt. `R> `

getwd()This function shows the default file path. It returns a character string of that path (in fact, a list of strings containing one element). Because the return value of this function is not assigned, it is displayed. The

The same function, with a

`[1]`

in the output tells us that the following element (string) is the first (and here, the only) element in the list of strings.The same function, with a

`dir`

argument, can also be used to change the current working directory.[1] "E:/Hugo/multilevel"

`R> `

x24 <- read.table(file="x24.txt",header=F)The function

This function returns a data frame object, which is assigned here to a data frame named

`read.table`

reads a data sheet or matrix from the specified file. The file specification is a character string enclosed in double quotes. The file is searched in the default working directory, see above. The function returns a data object of class `data.frame`

. Optionally, the first line of the ASCII data file specified as argument, may contain the names of the columns (variables), in which case we should add the argument `header=TRUE`

. Since our data file doesn't have such a first line, this option is switched off by specifying `header=FALSE`

(which is also the default value for this argument). Boolean values TRUE and FALSE may be abbreviated to their first character. This function returns a data frame object, which is assigned here to a data frame named

`x24`

by means of the assignment operator `<-`

. Without such assignment, the data are read from file, the contents of the resulting data frame is displayed on the screen, and the result is then lost. Warning: Assignments are carried out without checking whether the object already exists. In other words, if some object named `x24`

already exists, it will be overwritten by this assignment operation. Make sure that you do not lose valuable information by inadvertedly overwriting it. `R> `

head(x24)
Show the first 6 elements of the specified data object.

V1 V2 V3 V4 1 1 1 1 0.72383 2 1 2 1 0.06709 3 1 3 1 1.19462 4 1 4 1 1.72527 5 1 5 1 0.52362 6 1 6 1 1.09539

`R> `

dimnames(x24)[[2]] <- c("subj","item","cond","resp")
The data frame

`x24`

has two dimensions, viz. rows and columns. We'd prefer to assign helpful names to the columns, i.e. to the names of the elements in dimension number two, to replace the names `V1, V2, V3, V4`

. This is done by constructing a list of four strings, and assigning this to the names of the second dimension of object `x24`

.`R> `

# cond is coerced from numerical into factor`R> `

x24$cond <- as.factor(x24$cond)
The hash is the comment symbol; everything following the hash on the same line is ignored.

All variables in

`x24`

are numeric. In regression, this means that the continuous numerical values would be used, instead of the discrete categorical values. Replace variable `cond`

in data frame `x24`

by its own values but coerced (forced) into a special class named `factor`

, for discrete factors. `R> `

require(lme4)
The package

`lme4`

(Bates, 2005) is required for further analysis. If the package is not yet loaded, then do so. Any packages that are required by `lme4`

are also loaded. Loading required package: lme4 Loading required package: Matrix Loading required package: lattice [1] TRUE

`R> `

# empty model (8)`R> `

m8 <- lmer(resp~1+(1|subj)+(1|item),data=x24,method="ML")
This is the first call of function _{0(j0)}; it specifies an intercept for the effect of _{0(0k)}. The two random effects, of

The

`lmer`

. Study the help information for this function. The first argument is the regression formula. This formula contains the random components in parentheses; the residual is not specified. The sequence `(1|subj)`

corresponds to u`subj`

that is the same for all conditions. Likewise, the sequence `(1|item)`

corresponds to v`subj`

ects and of `item`

s, are crossed.The

`data`

argument contains the data frame object to use, and the `method`

argument specifies Maximum Likelihood as the estimation method (Pinheiro & Bates, 2000). The function returns an object containing the model fit and residuals; this object of class `lmer`

is assigned to a new object named `m8`

.`R> `

summary(m8)
Summarize the contents of

`m8`

.Linear mixed-effects model fit by maximum likelihood Formula: resp ~ 1 + (1 | subj) + (1 | item) Data: x24 AIC BIC logLik MLdeviance REMLdeviance 2087 2101 -1040 2081 2083 Random effects: Groups Name Variance Std.Dev. item (Intercept) 0.25664 0.50659 ## v_0(0k) subj (Intercept) 0.28828 0.53692 ## u_0(j0) Residual 0.54027 0.73503 ## e_i(jk) number of obs: 864, groups: item, 36; subj, 24 Fixed effects: Estimate Std. Error t value (Intercept) 0.0235 0.1406 0.1671 ## gamma_0(00)

Note that there is no standard error (or other indication of confidence) for the estimates in the random part. These variance components are given in units of variance and in units of standard deviation.

For estimates in the fixed part, thet value is given as a measure of confidence. The probability of t is not given, because the degrees of freedom cannot be determined: these could be either 36-1 or 24-1. For conservative testing, use the lowest of these two alternatives to determine an appropriate critical value.

For estimates in the fixed part, the

`R> `

m8v2 <- lmer(resp~-1+cond+(1|subj)+(1|item),data=x24,method="ML")
This intermediate model, not specified in the manuscript but nevertheless reported in Table 2, has the fixed effect of cond in its fixed part. Instead of listing only the intercept, in the fixed part of the regression formula, this call suppresses the intercept (

`-1`

) and adds the fixed effect of `cond`

. Because the condition effect is not given in the random part, this model assumes homoschedasticity and sphericity, much like a RM-ANOVA does. Hence the standard errors are the same for all three fixed estimates.`R> `

summary(m8v2)
Linear mixed-effects model fit by maximum likelihood Formula: resp ~ -1 + cond + (1 | subj) + (1 | item) Data: x24 AIC BIC logLik MLdeviance REMLdeviance 2045 2069 -1017 2035 2045 Random effects: Groups Name Variance Std.Dev. item (Intercept) 0.25789 0.50783 ## v_0(0k) subj (Intercept) 0.28913 0.53771 ## u_0(j0) Residual 0.51033 0.71437 ## e_i(jk) number of obs: 864, groups: item, 36; subj, 24 Fixed effects: Estimate Std. Error t value cond1 0.21606 0.14485 1.4916 ## gamma_H cond2 0.04569 0.14485 0.3154 ## gamma_N cond3 -0.19127 0.14485 -1.3204 ## gamma_L

`R> `

anova(m8,m8v2)
We can evaluate whether this model is better than the predecessor one, with the same random part but without the fixed effect of

`cond`

, by comparing their likelihood ratios using function `anova`

with the two models as two parameters. This method of evaluating fixed effects may be too liberal (Pinheiro & Bates, 2000), especially if there are few observations and many model parameters. Data: x24 Models: m8: resp ~ 1 + (1 | subj) + (1 | item) m8v2: resp ~ -1 + cond + (1 | subj) + (1 | item) Df AIC BIC logLik Chisq Chi Df Pr(>Chisq) m8 3 2086.7 2101.0 -1040.3 m8v2 5 2044.8 2068.6 -1017.4 45.901 2 1.079e-10 *** --- Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Here there are many observations, and only 5 model parameters, so there is little chance of inflated significance. The effect of treatment condition on the observed response is clearly significant.

`R> `

# full model (9)`R> `

m9 <- lmer(resp~-1+cond+(-1+cond|subj)+(1|item),data=x24,method="ML")
This full model has the effect of

`cond`

in its fixed part, and in the random part at the participant level. Hence this model does not require homoschedasticity nor sphericity. `R> `

summary(m9)
Linear mixed-effects model fit by maximum likelihood Formula: resp ~ -1 + cond + (-1 + cond | subj) + (1 | item) Data: x24 AIC BIC logLik MLdeviance REMLdeviance 2047 2095 -1014 2027 2036 Random effects: Groups Name Variance Std.Dev. Corr item (Intercept) 0.25344 0.50342 ## v_0(0k) subj cond1 0.25067 0.50067 ## u_H 0(j0) cond2 0.30652 0.55364 0.922 ## u_N 0(j0) cond3 0.35956 0.59964 0.898 0.960 ## u_L 0(j0) Residual 0.49429 0.70306 ## e_i(jk) number of obs: 864, groups: item, 36; subj, 24 Fixed effects: Estimate Std. Error t value cond1 0.21606 0.13857 1.5593 ## gamma_H cond2 0.04569 0.14672 0.3114 ## gamma_N cond3 -0.19127 0.15407 -1.2414 ## gamma_L

Between-speaker variances seem to be different in the three treatment conditions, although this cannot be tested (yet) in _{H} are correlated with u_{N}, with r=.92. The different estimates of between-speaker variance in the random part lead to different standard errors for the three treatment conditions in the fixed part.
In the simulated data, the residual variances at the lowest level were not homogenenous either (cf. Note 1 in the manuscript), but this heteroschedasticity at the lowest level cannot be captured here.

`lmer`

. Speaker's deviances u2007.06.08 HQ