\documentclass[doc,apacite]{apa6}
\usepackage{Sweave}
\usepackage{graphicx}
\usepackage{rotating}
\usepackage{verbatim}
\usepackage{enumerate}
\usepackage{amssymb}
\usepackage[english]{babel}
\usepackage[utf8]{inputenc}
\usepackage{floatrow}
\floatsetup[table]{capposition=top}
% some commands for avoiding floats flushed to the end
\renewcommand{\topfraction}{0.9} % max fraction of floats at top
\renewcommand{\bottomfraction}{0.8} % max fraction of floats at bottom
\setcounter{topnumber}{2}
\setcounter{bottomnumber}{2}
\setcounter{totalnumber}{4} % 2 may work better
\renewcommand{\textfraction}{0.07} % allow minimal text w. figs
\renewcommand{\floatpagefraction}{0.7} % require fuller float pages
\DefineVerbatimEnvironment{Sinput}{Verbatim}
{formatcom={\vspace{-0.5ex}},fontshape=sl, % changed -2.5 to -0.5
fontfamily=courier,fontseries=b, fontsize=\scriptsize}
\DefineVerbatimEnvironment{Soutput}{Verbatim}
{formatcom={\vspace{-0.5ex}},fontfamily=courier,fontseries=b,%
fontsize=\scriptsize}
\title{Modelling Non-linear Relationships in ERP Data Using Mixed-effects Regression with R Examples}
%\VignetteIndexEntry{Modelling Non-linear Relationships in ERP Data Using Mixed-effects Regression with R Examples}
\shorttitle{Non-linear Relationships in ERP Data}
\twoauthors{Antoine Tremblay}{Aaron J. Newman}
\twoaffiliations{Dalhousie University, Halifax, Nova Scotia, Canada}{Dalhousie University, Halifax, Nova Scotia, Canada}
\leftheader{Tremblay \& Newman}
\abstract{
In the analysis of psychological and psychophysiological data, the relationship between two variables is often assumed to be a straight line. This may be due to the prevalence of the general linear model in data analysis in these fields, which makes this assumption implicitly. However, there are many problems for which this assumption does not hold. In this paper, we show that in the analysis of event-related potential (ERP) data, the assumption of linearity comes at a cost and may significantly affect the inferences drawn from the data. We demonstrate why the assumption of linearity should be relaxed and how to model nonlinear relationships between ERP amplitudes and predictor variables within the familiar framework of generalized linear models, using restricted cubic splines and mixed-effects regression.
This paper has been written in \LaTeX{} using \texttt{Sweave} and \texttt{R}, and the source document is provided as supplementary material. The data, a \texttt{pdf} of this paper, the \texttt{.Rnw} file used to write it, and the \texttt{R} code used to generate all of the analyses, tables, and figures presented here are available in package \texttt{EEGLMERdata}. The package is available as supplementary material for this article, and also from \texttt{http://hdl.handle.net/10222/22146}. These resources can be used to work through the examples, and potentially act as a starting point for the reader's own forays into mixed-effects analysis.
}
\keywords{EEG; ERP; nonlinear relationships; LME; restricted cubic splines.}
\authornote{ Corresponding author: Antoine Tremblay, NeuroCognitive Imaging Laboratory, Dalhousie University, Halifax, NS B3H 4R2, Canada. Tel.: (902) 494-1911. A.T. was supported during data collection by a Social Sciences and Humanities Research Council of Canada (SSHRC) doctoral fellowship while in the Department of Linguistics at the University of Alberta, Edmonton, Canada (2007--2009). A.T. was supported during analysis and writing by a SSHRC post-doctoral fellowship in the Department of Psychology and Neuroscience at Dalhousie University, Halifax, Canada (2011--2013). A.J.N. was supported by a SSHRC Standard Research Grant.}
\begin{document}
\maketitle
{\section{Introduction}
{% first part of intro
Event-related potential (ERP) datasets are commonly collected in cognitive neuroscience experiments because they offer rich spatio-temporal information about brain activity during perception, cognition, and action. The potential power of ERP data comes at the cost that the datasets tend to be large and quite complex. The richness of the data thus demands analytical techniques that can both fully describe the complexity of the data and control for factors that might interfere with such analyses. In trying to deal with such complex data, it is very common to assume that the relationship between ERP amplitudes/latencies and an independent variable is linear (i.e., a straight line). For example, in a study that investigates the relationship between age and amplitude of the N1 ERP component, one would assume that a given increase in age is associated with a consistent increase (or decrease) in the amplitude and/or latency of the N1, regardless of whether the increase is from 23 to 24 years old or from 69 to 70 years old.
In reality, there are many problems for which the answer is not a straight line \cite{Rupert2003, Pinheiro2000, Wood2006, WuZhang2006, Hastie1990, BaayenKuperman2010, TremblayBaayen2010, Kryuchkova2012} and ERPs are no exception. For example, \citeA{Carbon2005} investigated the effect of ``Thatcherized'' faces (in which the eyes and mouth regions are turned upside-down) on N170 amplitudes (a negative component occurring between 130 and 200 ms at occipito-temporal scalp sites). Pictures of normal and Thatcherized faces were presented at either 0, 90, or 180$^{\circ}$ rotation; only when presented upright are these perceived as severely distorted. \citeA{Carbon2005} found a nonlinear effect of rotation on N170 amplitudes in response to faces: Upright Thatcherized faces elicited a smaller N170 than Thatcherized faces rotated 90 or 180$^{\circ}$. The effect on N170 amplitudes of the latter two orientations, however, did not differ. \citeA{Boutheina2009} also investigated the effects of picture rotation on N170 amplitudes. They used pictures of normal faces that were rotated 0, 22.5, 45, 67.5, 90, 112.5, 157.5, or 180$^{\circ}$ and found a relatively complex nonlinear effect on N170 amplitudes and latencies \cite[]{Boutheina2009}. Numerous other examples of nonlinear effects on ERPs can be found in the literature, in areas such as development \cite{Webb2005}, rhythm perception \cite{PablosMartin2007}, and auditory processing \cite{Inui2010}.
The studies mentioned above did not assume linearity, which is a step forward. However, in keeping with the traditional ANOVA approach to statistical analysis, they tested for nonlinear relationships using the common practice of discretizing the variables of interest. That is, they reduced inherently continuous measures, such as rotation angle (or stimulus duration, interstimulus interval, magnitude, frequency, etc.), to factor variables with only a few levels. Such a practice is likely attributable to the natural development of cognitive neuroscience from behavioural approaches to studying cognition that traditionally rely heavily on ANOVA models. However, it comes with problems of its own. Not only does discretization introduce bias into the analysis, it also decreases statistical power and increases the probability of finding spurious associations \cite{Cohen:1983, McCallum2002}. Fortunately, such factorization is neither desirable nor necessary to detect nonlinear relationships.
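The cost of discretization can be seen in a small simulation (a base \texttt{R} sketch with toy data of our own, not drawn from the studies above): median-splitting a continuous predictor discards the variation within each half and weakens the estimated association.

```r
# A minimal base-R sketch (our own toy data, not from the studies above):
# median-splitting a continuous predictor weakens the estimated association.
set.seed(1)
x <- runif(200)                       # continuous predictor
y <- 2 * x + rnorm(200)               # true linear relationship plus noise
m.cont  <- lm(y ~ x)                  # model on the raw values
m.split <- lm(y ~ (x > median(x)))    # model after a median split
summary(m.cont)$r.squared             # higher: all the information is used
summary(m.split)$r.squared            # lower: within-half variance is discarded
```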
}
{\subsection{Relaxing the assumption of linearity: A set of simple examples}
Assuming linearity when it does not hold may lead one (i) to overlook important structure in the data, (ii) to conclude that there is no relationship between two variables when in fact there is one, and (iii) to underestimate the strength of the relationship between a dependent and an independent variable \cite{Sahai1997}. To illustrate these points, we simulated four data sets \cite[]{Wood2012, R300} and compared models that assume linearity to ones that do not (for the sake of this example, we will gloss over how the assumption of linearity was relaxed for the latter analyses). In Figure \ref{fig:example0}, the panels on the left show the models that were fitted assuming linearity, whereas the ones on the right show the models that were fitted without making this assumption. In the first simulated data set, shown in the top panels (A and B), the dependent (Y) variable does not correlate with the predictor variable, and neither of the two models reports that it does. The 2-point difference in AIC indicates that the two models are equivalent.\footnote{Smaller AIC values reflect more likely models. By convention, we consider that two models differ if the difference in AIC value is equal to or greater than 5, meaning that the model with the lower AIC is about 12 times more likely than the one with the higher AIC.}
Panels C and D show data in which there is a linear relationship between the response and predictor variables, which is captured by both linear and nonlinear models. Indeed, the nonlinear model fit results in a straight line since there is no nonlinearity in the relationship, and the AIC values indicate that both models are equally likely given the data. The simulated relationship between the dependent and independent variables in panels E and F is quadratic. The model for which linearity is assumed (panel E) not only fails to capture the quadratic relationship, but also falsely concludes that there is no significant relationship between X and Y. The model shown in panel F, however, is able to capture the quadratic trend, and comparison of the AIC values for the two models confirms that it is a better fit. In panels G and H, the simulated relationship is more complex, taking on a ``wavy'' function. Although the model for which linearity is assumed does find a significant relationship between X and Y, the fitted straight line captures only part of the underlying truth. This leads the model to underestimate the strength of the relationship between X and Y, and the accuracy of the prediction varies as a function of X. Conversely, the model for which linearity is not assumed, illustrated in panel H, is able to capture the true nature of the relationship, and correspondingly has a much lower AIC value.
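The rule of thumb in the footnote above follows from the definition of AIC: between two models fitted to the same data, the model with the lower AIC is $\exp(\Delta\mathrm{AIC}/2)$ times more likely. A one-line \texttt{R} illustration (the helper name is ours):

```r
# Evidence ratio for an AIC difference delta: exp(delta / 2).
# (aic_evidence_ratio is our own illustrative helper name.)
aic_evidence_ratio <- function(delta) exp(delta / 2)
aic_evidence_ratio(2)   # about 2.7: the models are close to equivalent
aic_evidence_ratio(5)   # about 12.2: the threshold used in the text
```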
{%%% load function plotGamTweaked
{\footnotesize
<<>>=
library(mgcv)
options(warn=-1)
plotGamTweaked<-function(x,residuals=FALSE,se=TRUE,
select=1,n=200,n2=n,seWithMean=FALSE,...){
i<-select
w.resid<-NULL
m<-length(x$smooth)
order<-attr(x$pterms,"order")
if(is.numeric(se)){
if(se<1){
se<-FALSE
}else{
se<-TRUE
}
}
if(se&&x$Vp[1,1]<=0){
se<-FALSE
warning("No variance estimates available")
}
if(residuals){
fv.terms<-predict(x,type="terms")
if (is.null(w.resid)){
w.resid<-x$residuals*sqrt(x$weights)
}
}
if(m>0){
if(x$smooth[[i]]$dim==1){
raw<-x$model[x$smooth[[i]]$term]
xx<-seq(min(raw),max(raw),length = n)
if(x$smooth[[i]]$by!="NA"){
by<-rep(1,n)
dat<-data.frame(x=xx,by=by)
names(dat)<-c(x$smooth[[i]]$term,x$smooth[[i]]$by)
}else{
dat<-data.frame(x=xx)
names(dat)<-x$smooth[[i]]$term
}
X<-PredictMat(x$smooth[[i]],dat)
first<-x$smooth[[i]]$first.para
last<-x$smooth[[i]]$last.para
p<-x$coefficients[first:last]
offset<-attr(X,"offset")
if(is.null(offset)){
fit<-X%*%p
}else{
fit<-X%*%p+offset
}
if(se){
if(seWithMean&&attr(x$smooth[[i]],"nCons")>0){
X1<-matrix(x$cmX,nrow(X),ncol(x$Vp),
byrow=TRUE)
meanL1<-x$smooth[[i]]$meanL1
if(!is.null(meanL1)){
X1<-X1/meanL1
}
X1[,first:last]<-X
se.fit<-sqrt(rowSums((X1%*%x$Vp)*X1))
}else{
se.fit<-sqrt(rowSums((X%*%x$Vp[first:last,
first:last])*X))
}
}
edf<-sum(x$edf[first:last])
xterm<-x$smooth[[i]]$term
pd.item<-list(fit=fit,dim=1,x=xx,raw=raw[[1]])
if(residuals){
pd.item$p.resid<-fv.terms[,length(order)+i]+w.resid
}
if(se)pd.item$se=se.fit
}else if(x$smooth[[i]]$dim==2){
xterm<-x$smooth[[i]]$term[1]
yterm<-x$smooth[[i]]$term[2]
raw<-data.frame(x=x$model[xterm][[1]],
y=x$model[yterm][[1]])
n2<-max(10,n2)
xm<-seq(min(raw$x),max(raw$x),length=n2)
ym<-seq(min(raw$y),max(raw$y),length=n2)
xx<-rep(xm,n2)
yy<-rep(ym,rep(n2,n2))
if(x$smooth[[i]]$by!="NA"){
by<-rep(1,n)
dat<-data.frame(x=xx,y=yy,by=by)
names(dat)<-c(xterm,yterm,x$smooth[[i]]$by)
}else{
dat<-data.frame(x=xx,y=yy)
names(dat)<-c(xterm,yterm)
}
X<-PredictMat(x$smooth[[i]],dat)
first<-x$smooth[[i]]$first.para
last<-x$smooth[[i]]$last.para
p<-x$coefficients[first:last]
offset<-attr(X,"offset")
if(is.null(offset)){
fit<-X%*%p
}else{
fit<-X%*%p+offset
}
if(se){
if(seWithMean&&attr(x$smooth[[i]],"nCons")>0){
X1<-matrix(x$cmX,nrow(X),ncol(x$Vp),
byrow=TRUE)
meanL1<-x$smooth[[i]]$meanL1
if(!is.null(meanL1)){
X1<-X1/meanL1
}
X1[,first:last]<-X
se.fit<-sqrt(rowSums((X1%*%x$Vp)*X1))
}else{
se.fit<-sqrt(rowSums((X%*%x$Vp[first:last,
first:last])*X))
}
}
edf<-sum(x$edf[first:last])
pd.item<-list(fit=fit,dim=2,xm=xm,ym=ym,raw=raw)
if(se){
pd.item$se=se.fit
}
if(residuals){
pd.item$p.resid<-fv.terms[,length(order)+i]+w.resid
}
}else{
pd.item<-list(dim=x$smooth[[i]]$dim)
}
}
if("xm"%in%names(pd.item)){
res<-list(xm=pd.item$xm,ym=pd.item$ym,mat=pd.item$fit,n=n)
# rows are xm and columns are ym
# rownames(mat)<-res$xm
# colnames(mat)<-res$ym
}else{
res<-list(x=pd.item$x,y=pd.item$fit,n=n)
}
if(se){
res$se=pd.item$se
}
if(residuals)res$partial.resids<-pd.item$p.resid
return(invisible(res))
}
@
}
}
{%%% load function simOsc
{\footnotesize
<<>>=
simOsc = function(
n=60, # grid size
Xdamped=0, # whether wave in X dimension should be damped
XdampedDegree=0.02, # degree of damping in X dimension
Xdomain=c(-3,3), # range for X
Xoffset = 0, # shift for horizontal dimension
Ydamped=0, # whether wave in Y dimension should be damped
YdampedDegree=0.02, # degree of damping in Y dimension
Ydomain=c(1.2,7.2), # range for Y
Yoffset = 0, # shift for vertical dimension
mirror = 1, # mirror in X-Y plane if mirror=-1
plot=TRUE # show plot of surface
) {
# the matrix is filled to match the orientation expected by image()
t = sin(2*(seq(Xdomain[1],Xdomain[2],length=n)+Xoffset))*exp(-XdampedDegree*(n:1)*Xdamped)
f = sin(seq(Ydomain[1], Ydomain[2],length=n)+Yoffset)*exp(-YdampedDegree*(1:n)*Ydamped)
m=matrix(0,n,n)
for (i in 1:n) {
m[i,] = t[i]*f
}
m = m*mirror
if (plot) {
image(m, col=topo.colors(30),axes=FALSE, xlab="Time", ylab="Predictor")
contour(m, add=TRUE)
box()
}
return(t(m)[n:1,]) # in normal matrix form
}
@
}
}
{%%% linear vs nonlinear
{\footnotesize
<<>>=
pdf(file="figs/fig-linearVSnonlinear.pdf",height=10,width=10)
par(mfrow = c(4, 2))
ylimit <- c(-2.5, 20)
#
# generate noise
#dat <- data.frame(x = 1:100, y = runif(100,min = -10, max = 10)) + 7.5
#save(dat,file="data/simAB.rda",compress="xz")
load("data/simAB.rda")
m <- gam(y ~ s(x, bs = "cs"), data = dat)
m.lm <- lm(y ~ x, data = dat)
myaic<-c(AIC(m),AIC(m.lm))
plot.dat <- plotGamTweaked(m, select = 1, n = nrow(dat))
# lm model 0
plot(dat$x, dat$y, ylim = ylimit, xlab = "variable X",
ylab = "variable Y", cex.lab = 1.5, cex.axis = 1.25)
title(main = "(A)", cex.main = 1.5, line = 3)
title(main = paste("p-value =", round(summary(m.lm)$coefficients[2, 4], 2),
sep=" "), cex.main = 1.5, line = 1.7)
title(main = paste("R-squared = ", round(summary(m.lm)$r.squared,
2), "; AIC = ", round(myaic[2]), sep = ""), cex.main = 1.5, line = 0.4)
abline(m.lm, col = "red", lwd = 2.5)
# gam model 0
plot(dat$x, dat$y, ylim = ylimit, xlab = "", ylab = "", cex.axis = 1.25)
title(main = "(B)", cex.main = 1.5, line = 3)
title(main = paste("p-value =",round(summary(m)$s.table[1,4],2), sep=" "),
cex.main = 1.5, line = 1.7)
title(main = paste("R-squared = ", round(summary(m)$r.sq, 2),
"; AIC = ", round(myaic[1]), sep = ""), cex.main = 1.5, line = 0.4)
lines(plot.dat$x, plot.dat$y + summary(m)$p.table["(Intercept)", "Estimate"],
col = "blue", lwd = 2.5)
#
# generate simulated data
load("data/simCD.rda")
#dat <- gamSim()
#save(dat,file="data/simCD.rda",compress="xz")
m <- gam(y ~ s(x1, bs = "cs"), data = dat)
m.lm <- lm(y ~ x1, data = dat)
myaic<-c(AIC(m),AIC(m.lm))
plot.dat <- plotGamTweaked(m, select = 1, n = nrow(dat))
# lm model 1
plot(dat$x1, dat$y, ylim = ylimit, xlab = "", ylab = "", cex.axis = 1.25)
title(main = "(C)", cex.main = 1.5, line = 3)
title(main = "p-value < 0.001", cex.main = 1.5, line = 1.7)
title(main = paste("R-squared = ", round(summary(m.lm)$r.squared,
2), "; AIC = ", round(myaic[2]), sep = ""), cex.main = 1.5, line = 0.4)
abline(m.lm, col = "red", lwd = 2.5)
# gam model 1
plot(dat$x1, dat$y, ylim = ylimit, xlab = "", ylab = "", cex.axis = 1.25)
title(main = "(D)", cex.main = 1.5, line = 3)
title(main = "p-value < 0.001",
cex.main = 1.5, line = 1.7)
title(main = paste("R-squared = ", round(summary(m)$r.sq, 2),
"; AIC = ", round(myaic[1]), sep = ""), cex.main = 1.5, line = 0.4)
lines(plot.dat$x, plot.dat$y + summary(m)$p.table["(Intercept)", "Estimate"],
col = "blue", lwd = 2.5)
#
# generate simulated data
ylimit <- c(-5, 8.5)
#load("data/simEF.rda")
#dat <- gamSim(4)
#save(dat,file="data/simEF.rda",compress="xz")
#dat<-dat[dat$fac=="1",]
#dat2<-dat
#dat2$y<-jitter(dat2$y,amount=2)
#dat<-rbind(dat,dat2)
#dat[dat$x2>0.3&dat$x2<=0.7,"y"]<-dat[dat$x2>0.3&dat$x2<=0.7,"y"]*1.25
#dat[dat$x2<=0.2,"y"]<-dat[dat$x2<=0.2,"y"]/2
#dat[dat$x2<=0.2,"y"]<-jitter(dat[dat$x2<=0.2,"y"],amount=2)
#save(dat,file="data/simEF_v2.rda",compress="xz")
load("data/simEF_v2.rda")
m <- gam(y ~ s(x2, bs = "cs"), data = dat)
m.lm <- lm(y ~ x2, data = dat)
myaic<-c(AIC(m),AIC(m.lm))
plot.dat <- plotGamTweaked(m, select = 1, n = nrow(dat))
# lm model
plot(dat$x2, dat$y, ylim = ylimit, xlab = "", ylab = "", cex.axis = 1.25)
title(main="(E)", cex.main = 1.5, line = 3)
title(main = paste("p-value =", round(summary(m.lm)$coefficients[2,4], 2),
sep=" "), cex.main = 1.5, line = 1.7)
title(main = paste("R-squared = ", round(summary(m.lm)$r.squared,
2), "; AIC = ", round(myaic[2]), sep = ""), cex.main = 1.5, line = 0.4)
abline(m.lm, col = "red", lwd = 2.5)
# gam model
plot(dat$x2, dat$y, ylim = ylimit, xlab = "", ylab = "", cex.axis = 1.25)
title(main = "(F)", cex.main = 1.5, line = 3)
title(main = "p-value < 0.001", cex.main = 1.5, line = 1.7)
title(main = paste("R-squared = ", round(summary(m)$r.sq, 2),
"; AIC = ", round(myaic[1]), sep = ""), cex.main = 1.5, line = 0.4)
lines(plot.dat$x, plot.dat$y + summary(m)$p.table["(Intercept)", "Estimate"],
col = "blue", lwd = 2.5)
#
# generate simulated data
ylimit <- c(-2.5, 20)
load("data/simGH.rda")
#dat <- gamSim()
#save(dat,file="data/simGH.rda",compress="xz")
m <- gam(y ~ s(x2, bs = "cs"), data= dat)
m.lm <- lm(y ~ x2, data = dat)
myaic<-c(AIC(m),AIC(m.lm))
plot.dat <- plotGamTweaked(m, select = 1, n = nrow(dat))
# lm model 3
plot(dat$x2, dat$y, ylim = ylimit, xlab = "", ylab = "", cex.axis = 1.25)
title(main = "(G)", cex.main = 1.5, line = 3)
title(main = "p-value < 0.001", cex.main = 1.5, line = 1.7)
title(main = paste("R-squared = ", round(summary(m.lm)$r.squared,
2), "; AIC = ", round(myaic[2]), sep = ""), cex.main = 1.5, line = 0.4)
abline(m.lm, col = "red", lwd = 2.5)
# gam model 3
plot(dat$x2, dat$y, ylim = ylimit, xlab = "", ylab = "", cex.axis = 1.25)
title(main = "(H)", cex.main = 1.5, line = 3)
title(main = "p-value < 0.001", cex.main = 1.5, line = 1.7)
title(main = paste("R-squared = ", round(summary(m)$r.sq, 2),
"; AIC = ", round(myaic[1]), sep = ""), cex.main = 1.5, line = 0.4)
lines(plot.dat$x, plot.dat$y + summary(m)$p.table["(Intercept)", "Estimate"],
col = "blue", lwd = 2.5)
par(mfrow=c(1,1))
dev.off()
@
}
}
\begin{figure}[h]
\centering
\includegraphics[width=0.9\textwidth,height=0.7\textheight]{figs/fig-linearVSnonlinear.pdf}
\caption{Assuming linearity (left column; red lines) versus not assuming linearity (right column; blue lines).}
\label{fig:example0}
\end{figure}
}
{\subsection{Modelling Non-linear Relationships}
Testing for nonlinear relationships can be seen as a natural extension of testing for linear relationships. Linear regression is familiar to virtually anyone with a modicum of training in statistics, and allows one to test for a linear relationship between two variables as shown in the left-hand columns of Figure \ref{fig:example0}. Thus, for example, in the analysis of the effects of image rotation in the \citeA{Carbon2005} study described above, the levels of rotation could have been treated as values along a continuum of possible rotation angles, rather than as categorical levels. The problem with this approach, as demonstrated in the examples in Figure \ref{fig:example0}, is that if the relationship is nonlinear then a linear fit will not be optimal. In contrast, pairwise testing of different ``levels'' of rotation is able to detect which (if any) pairs of levels differ. Certain relationships can also be detected using a set of linear contrast weights, as in Helmert or polynomial contrasts. However, this approach does not generalize easily to many levels of a factor, and because each pairwise or contrast test assumes independence, correction for multiple comparisons should be applied --- potentially weakening sensitivity to extant effects.
Non-linear relationships can however be modelled without discretization in a number of ways. For example, one can use polynomials, cubic regression splines, thin plate regression splines, Duchon splines, splines on the sphere, P-splines, or Markov random fields. All these methods essentially apply a series of linear transformations to the predictor variables. In this paper, we use restricted cubic splines (RCS) as implemented by function {\tt rcs} from package {\tt rms} \cite{Harrell2012} to model nonlinear relationships.\footnote{We do not use polynomials because ``polynomials have some undesirable properties \cite[]{Magee1998} and will not fit adequately many functional forms \cite{Devlin1986}. For example, polynomials do not adequately fit logarithmic functions or `threshold' effects'' \cite<>[p. 18]{Harrell:2001}.} RCS are a set of cubic polynomial functions joined together at points termed ``knots''. The RCS function with $k$ knots $t_1, \dots, t_k$ is stated as follows \cite<>[p. 21, equation 2.26]{Harrell:2001}:
\begin{equation}
\label{eq:eq1}
f(X) = \beta_0 + \beta_1\mathbf{X} + \beta_2(\mathbf{X}-t_1)_+^3 +
\beta_3(\mathbf{X}-t_2)_+^3 + \dots + \beta_{(k+1)}(\mathbf{X}-t_k)_+^3
\end{equation}
with the conditions that the sections of cubic polynomial must meet at the knots and that their first (and second) derivatives must match at these points. The basic idea is that the data are represented with a series of cubic polynomials -- the $\beta_j(\mathbf{X}-t_k)_+^3$ portions of equation~(\ref{eq:eq1}) -- that are joined together so as to create a continuous curve.
The {\tt rcs} function merely serves to transform variables. The actual modelling of the data will be performed with linear mixed-effects regression, which we briefly describe below.
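For concreteness, the truncated power basis of equation~(\ref{eq:eq1}) can be built directly in base \texttt{R} (a sketch with an illustrative helper of our own; {\tt rcs} in package {\tt rms} constructs a \emph{restricted} version of this basis, constrained to be linear beyond the outermost knots):

```r
# Truncated power basis for a cubic spline with the given knots:
# one linear column X plus one column (X - t_j)_+^3 per knot.
# (cubic_spline_basis is our own illustrative helper, not rms::rcs.)
cubic_spline_basis <- function(x, knots) {
  pos_cube <- function(u) pmax(u, 0)^3   # the (u)_+^3 truncation
  B <- cbind(x, sapply(knots, function(t) pos_cube(x - t)))
  colnames(B) <- c("X", paste0("knot", seq_along(knots)))
  B
}
x <- seq(0, 1, length.out = 100)
B <- cubic_spline_basis(x, knots = c(0.25, 0.50, 0.75))
dim(B)   # 100 rows; 1 linear column plus one column per knot
```

Regressing a response on these columns with {\tt lm} fits the unrestricted spline of equation~(\ref{eq:eq1}); {\tt rcs} additionally imposes the tail-linearity restriction.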
}
{\subsection{Linear Mixed Effects Regression}
Linear mixed-effects regression (LME) is a relatively recent development in statistics, and is a natural tool for modelling repeated measures \cite{WuZhang2006, Wu2010}. Details about mixed-effects modelling, as well as its potential advantages over rmANOVA, can be found in a number of recent papers and books \cite[]{Pinheiro2000, Faraway2005, WuZhang2006, Gellman2007, Baayen:2008, Baayen2008, Quene2008, Zuur2009, Wu2010}. LME has been applied to ERP data in several published papers, including \citeA{Bagiella2000}, \citeA{Davidson2007}, \citeA{Moratti2007}, \citeA{Pritchett2010}, \citeA{Wierda2010}, \citeA{Hsu2011}, \citeA{Vossen2011}, and \citeA{Newman2012}.
Like ANOVA, LME fits within the framework of the generalized linear mixed model (GLMM). ANOVA and LME differ in that (1) LME can model more complex random-effects structures \cite[]{Baayen2008, Quene2008}; (2) LME uses (restricted) maximum-likelihood methods for parameter estimation \cite[pp. 62--81 for more details]{Pinheiro2000}, whereas ANOVA uses the least-squares method; (3) LME is robust to violations of sphericity if the correct random-effects structure is used \cite{Bagiella2000, Baayen2008}, eliminating the need to correct for this post hoc using methods that are known to be either overly conservative (Greenhouse-Geisser) or liberal (Huynh-Feldt); and (4) LME enables one to appropriately model imbalanced data \cite{Bagiella2000, Gellman2007, Baayen2008}, a common situation in ERP data.
To fully make use of these capabilities, data should not be a\-ve\-raged across trials or any other variable. Rather, the LME model should be fitted on the un-a\-ve\-raged (``single trial'') data. Beyond allowing robust, accurate estimates of variance, this practice has the added benefit of allowing one to model nonlinear relationships between dependent and independent variables in multi-dimensional space. While at first it may seem odd to describe nonlinear modelling as using ``linear'' mixed effects, the generalization from the assumption of a linear relationship to a nonlinear one is readily accommodated by LME.
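As a schematic illustration of such a single-trial model (a sketch with simulated data and hypothetical variable names, not the model fitted later in this paper), an LME with by-subject and by-item random intercepts and a by-subject random slope can be specified with {\tt lmer} from package {\tt lme4} as follows:

```r
# A schematic single-trial LME (simulated data; variable names hypothetical).
library(lme4)
set.seed(1)
erp <- expand.grid(Subject = factor(1:6), Item = factor(1:20))
erp$Predictor <- rnorm(nrow(erp))                  # e.g., a centered covariate
erp$Amplitude <- 1 + 0.5 * erp$Predictor + rnorm(nrow(erp))
m <- lmer(Amplitude ~ Predictor +                  # fixed effect
            (1 + Predictor | Subject) +            # by-subject intercept and slope
            (1 | Item),                            # by-item intercept
          data = erp)
fixef(m)                                           # estimated intercept and slope
```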
}
{\subsection{Goals of the Present Study}
The goal of this paper is to demonstrate how relaxing the assumption of linearity can lead to better modelling of ERP data. We also demonstrate how to model nonlinear relationships using restricted cubic splines and mixed-effects regression. This paper has been written in \LaTeX{} using \texttt{Sweave} and \texttt{R}, and the source document is provided as supplementary material. The data, a \texttt{pdf} of this paper, the \texttt{.Rnw} file used to write it, and the \texttt{R} code used to generate all of the analyses, tables, and figures presented in this paper are available in package \texttt{EEGLMERdata}. The package is available as supplementary material or from \texttt{http://hdl.handle.net/10222/22146}. These resources can be used to work through the examples, and potentially act as a starting point for the reader's own forays into LME analysis. This paper can, however, be read and understood without viewing the accompanying \texttt{R} code.
}
}
{\section{Example ERP Data}
{% intro to section
The ERP data used here were collected in the context of an immediate free recall experiment described in \citeA{Tremblay2009} and \citeA{TremblayBaayen2010}. The goal of this experiment was to investigate whether frequently used four-word sequences, such as \textit{end of the year}, \textit{I don't really know}, and \textit{at the same time}, may be (de)composed via the application of compositional rules or stored and retrieved as wholes (in which case they should show effects of the frequency of co-occurrence of the four-word sequence). This was achieved by examining whether the frequency of such phrases affected early ERP components such as the anterior N1 (N1a) and the posterior P1. Both of these components peak approximately 110--150 ms after stimulus presentation, and were previously found to be modulated in studies of single words by linguistic variables including frequency and length \cite{sereno:rayner:posner:1998, Assadollahi2003, HaukPulvermuller2004, Hauk2006a, Murphy2006, Penolazzi2007}. \citeA{Tremblay2009} reported that higher-frequency four-word sequences indeed elicited more negative N1 and less positive P1 amplitudes than low-frequency sequences.
However, additional data were collected in this study that were not analyzed in \citeA{Tremblay2009} or \citeA{TremblayBaayen2010}. In the present paper, we investigate whether working memory capacity, the length of the second word of a sequence, the position in a list where a four-word sequence was presented (i.e., whether it was presented first, second, \ldots, fifth, or sixth in a block), and their interactions affected ERPs in the memory task.\footnote{The second word of a sequence appeared, more often than not, right where the fixation cross was presented (centre of the screen). Thus, the second word of a sequence was the first word that participants saw. It stands to reason that the longer the second word of a sequence, the greater the amplitude of the N1 should be, given that longer words elicit larger anterior negativities \cite{Murphy2006}.}
In the analyses presented here we focused on the N1a as the ERP component of interest. The N1a is known to be sensitive to spatial attention as well as lexical frequency, probability of occurrence, and word length \cite[and references cited therein]{Luck2005, Murphy2006, Penolazzi2007}. For instance, high-frequency words elicit greater N1a amplitudes than low-frequency words, and longer words elicit larger N1a amplitudes than shorter ones. Because in this paper our focus is on demonstrating nonlinear analysis and comparing it to a linear approach, we limited our analyses to a single electrode (Fz, located along the midline, midway between the nose and the vertex of the head) where the N1a component was maximal, and we do not discuss the implications of our results with respect to the cognitive or linguistic literature.
}
{\subsection{Participants}
Ten right-handed female students from the University of Alberta were paid for their participation in the experiment (mean age = 23.4 years; \emph{SD} = 1.6; min/max = 22/27). All were healthy native speakers of English, had normal or corrected-to-normal vision, and did not report any neurological deficits. Participants had a mean handedness score \cite{oldfield:1971} of 79.5/100 (\textit{SD} = 15.8). We assessed participants' reading span and working memory capacity (henceforth WMC) using an adaptation of the \citeA{Daneman1980} test presented on a PC using E-Prime (mean WMC score = 73/100; \textit{SD} = 10.4).
}
{\subsection{Study design}
The stimulus list consisted of 432 four-word sequences taken from the \emph{British National Corpus} \cite{davies:2004}. Some examples are \textit{end of the year}, \textit{I don't really know}, \textit{at the same time}, \textit{I have to say}, \textit{it would be a}, \textit{at the age of}, \textit{this is not a}, \textit{we've got to get}, and \textit{I think it's the}. Frequencies, which were obtained from the \emph{Variations in English Words and Phrases} search engine \cite{fletcher:2008}, ranged from 0.03 to 105 occurrences per million. The stimuli were divided into 72 blocks. Each block was divided into 18 trials, where in each trial six four-word sequences were randomly presented one at a time in the middle of the screen with an inter-stimulus interval of roughly 4000 ms. Sequences subtended on a\-ve\-rage $\sim5^{\circ} \times 0.4^{\circ}$ of visual angle; the longest four-word string (\textit{becoming increasingly clear that}) subtended $\sim8^{\circ} \times 0.4^{\circ}$ of visual angle. At the end of each trial (i.e., after having seen six four-word sequences), participants were prompted to type in as many sequences as they could recall.
}
{\subsection{EEG Recordings and Processing}
Electroencephalogram (EEG) recordings were made with active Ag/AgCl electrodes from 32 locations according to the international 10/20 system (\texttt{www.biosemi.com/headcap.htm}) at the midline (Fz, Cz, Pz, Oz) and left and right hemisphere (Fp1, Fp2, AF3, AF4, F3, F4, F7, F8, FC1, FC2, FC5, FC6, C3, C4, T7, T8, CP1, CP2, CP5, CP6, P3, P4, P7, P8, PO3, PO4, O1, O2), as well as the right and left mastoids, and referenced online to the common mode sense (CMS) active electrode. Electrodes were mounted on a nylon cap. Eye movements were monitored by electrodes placed above and below the left eye and at the outer canthi of both eyes, which were bipolarized off-line to yield vertical and horizontal electro-oculograms (EOG). Analog signals were sampled at 8,192 Hz using a BioSemi (Amsterdam, The Netherlands) ActiveTwo 24-bit digital amplification system with an input range of $\pm262$ mV and were band-pass filtered between 0.01 and 100 Hz. Note that BioSemi uses active electrodes and therefore can tolerate high scalp impedances.\footnotemark\footnotetext{See \texttt{http://www.biosemi.nl/forum/viewtopic.php?t=486\&highlight=impedance}}
The digitized EEG was initially processed off-line using \texttt{Analyzer} version 1.05 (Brain Products GmbH, Gilching, Germany): It was re-referenced to the a\-ve\-rage of the right and left mastoids, downsampled to 128 Hz, band-pass filtered from 0.01 to 30 Hz using a forward-backward (zero-phase) filter combination in which each pass used a two-pole infinite impulse response (IIR) Butterworth filter, and corrected for eye movements and eye blinks by regressing out the vertical and horizontal EOGs \cite{Gratton:Coles:Donchin:1983}. The processed signal was then segmented into epochs of 3000 ms (1500 ms before and after stimulus onset). Each epoch was baseline corrected on the 1500 ms segment immediately preceding stimulus onset.
}
}
{\section{Results}
{% intro to section
The three predictor variables (fixed effects) under investigation were (1) the position of the four-word sequence in the set of six items presented in a block (Position), (2) the length of the second word in the four-word chunk (Length)\footnote{The second word was chosen because it most often overlapped the center of the monitor, where subjects fixated between trials.}, and (3) the working memory capacity of the participant (WMC). These predictor variables were transformed prior to plotting or analysis: we mean-centered Position, Length, and WMC by subtracting the mean of each variable from each of its values. The main purpose of this centering was to increase numerical stability and to remove spuriously high correlations that may arise between random intercepts and random slopes, as well as between fixed and random variables \cite{Hofman1998, Kreft1998, Baayen:2008}. We first present the results from the behavioural analysis and then move on to the ERP analysis.
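Concretely, each predictor was centered by subtracting its sample mean; e.g., for WMC,
\begin{equation*}
\mathrm{WMC}^{c}_{i} = \mathrm{WMC}_{i} - \frac{1}{n}\sum_{j=1}^{n} \mathrm{WMC}_{j},
\end{equation*}
and likewise for Position and Length. This linear shift changes the interpretation of the intercept and lower-order terms, but not the shape of the fitted curves.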
}
{\subsection{Behavioural Results}
{% R code: behavioural analysis
% eval=FALSE
{\footnotesize
<>=
data(vars)
vars$LengthBc<-vars$LengthB-mean(vars$LengthB)
vars$WMCc<-vars$WMC-mean(vars$WMC)
vars$Positionc<-vars$Position-mean(vars$Position)
# look at probabilities
x1<-table(vars$Recalled,vars$Position)
x1<-x1[2,]/(x1[2,]+x1[1,])
x2<-table(vars$Recalled,vars$LengthB)
x2<-x2[2,]/(x2[2,]+x2[1,])
x3<-table(vars$Recalled,vars$WMC)
x3<-x3[2,]/(x3[2,]+x3[1,])
#
ylimit<-c(0,0.6)
pdf(file="figs/raw_prob_behavioural.pdf")
par(mfrow=c(2,2))
plot(x1,type="l",xlab="Position",ylim=ylimit,ylab="")
plot(as.numeric(names(x2)),x2,type="l",xlab="LengthB",
ylim=ylimit,ylab="prob. of correctly recalling a sequence")
plot(as.numeric(names(x3)),x3,type="l",xlab="WMC",
ylim=ylimit,ylab="",xaxt="n")
wmc2<-sort(unique(vars$WMC))
wmc2<-round(seq(min(wmc2),max(wmc2),length.out=6),2)
axis(side=1,at=wmc2,labels=wmc2,cex.axis=0.9)
par(mfrow=c(1,1))
dev.off()
#
library(rms)                       # provides rcs()
library(lme4)                      # provides glmer()
library(LMERConvenienceFunctions)  # provides relLik()
m1 <- glmer(Recalled ~ rcs(Position, 3) * rcs(LengthBc, 3) * rcs(WMCc, 4) +
(1 | Subject), data = vars, family = "binomial")
#
m2 <- glmer(Recalled ~ rcs(Position, 3) * rcs(LengthBc, 3) * rcs(WMCc, 4) +
(1 | Subject) + (1 | Item), data = vars, family = "binomial")
relLik(m1,m2)
#
m3 <- glmer(Recalled ~ rcs(Position, 3) * rcs(LengthBc, 3) * rcs(WMCc, 4) +
(1 | Subject) + (1 | Item) + (0 + rcs(WMCc, 4) | Item), data = vars,
family = "binomial")
relLik(m2,m3)
#
m4 <- glmer(Recalled ~ rcs(Position, 3) * rcs(LengthBc, 3) * rcs(WMCc, 4) +
(1 | Subject) + (1 | Item) + (0 + rcs(LengthBc, 3) | Subject),
data = vars, family = "binomial")
relLik(m2,m4)
#
m5 <- glmer(Recalled ~ rcs(Position, 3) * rcs(LengthBc, 3) * rcs(WMCc, 4) +
(1 | Subject) + (1 | Item) + (0 + rcs(Position, 3) | Subject),
data = vars, family = "binomial")
relLik(m2,m5)
#
m6 <- glmer(Recalled ~ rcs(Position, 3) * rcs(LengthBc, 3) * rcs(WMCc, 4) +
(1 | Subject) + (1 | Item) + (0 + rcs(Position, 3) | Subject) +
(0 + rcs(Position, 3) | Item), data = vars, family = "binomial")
relLik(m5,m6)
#
# save models
save(m1, file = "models/m1FzRCSbehavioral.rda", compress = "xz")
save(m2, file = "models/m2FzRCSbehavioral.rda", compress = "xz")
save(m3, file = "models/m3FzRCSbehavioral.rda", compress = "xz")
save(m4, file = "models/m4FzRCSbehavioral.rda", compress = "xz")
save(m5, file = "models/m5FzRCSbehavioral.rda", compress = "xz")
save(m6, file = "models/m6FzRCSbehavioral.rda", compress = "xz")
#
# m1 vs. m2
tmp <- anova(m1, m2)
class(tmp) <- "data.frame"
comp <- data.frame(Comparison = "m1 vs m2", AIC.x = tmp$AIC[1],
AIC.y = tmp$AIC[2], relLik = exp((abs(tmp$AIC[1]-tmp$AIC[2]))/2),
Chisq = tmp$Chisq[2], Chi.df = tmp$`Chi Df`[2],
p.value = tmp$`Pr(>Chisq)`[2])
#
# m2 vs. m3
tmp <- anova(m2, m3)
class(tmp) <- "data.frame"
comp <- rbind(comp, data.frame(Comparison = "m2 vs m3", AIC.x = tmp$AIC[1],
AIC.y = tmp$AIC[2], relLik = exp((abs(tmp$AIC[1]-tmp$AIC[2]))/2),
Chisq = tmp$Chisq[2], Chi.df = tmp$`Chi Df`[2],
p.value = tmp$`Pr(>Chisq)`[2]))
#
# m2 vs. m4
tmp <- anova(m2, m4)
class(tmp) <- "data.frame"
comp <- rbind(comp, data.frame(Comparison = "m2 vs m4", AIC.x = tmp$AIC[1],
AIC.y = tmp$AIC[2], relLik = exp((abs(tmp$AIC[1]-tmp$AIC[2]))/2),
Chisq = tmp$Chisq[2], Chi.df = tmp$`Chi Df`[2],
p.value = tmp$`Pr(>Chisq)`[2]))
#
# m2 vs. m5
tmp <- anova(m2, m5)
class(tmp) <- "data.frame"
comp <- rbind(comp, data.frame(Comparison = "m2 vs m5", AIC.x = tmp$AIC[1],
AIC.y = tmp$AIC[2], relLik = exp((abs(tmp$AIC[1]-tmp$AIC[2]))/2),
Chisq = tmp$Chisq[2], Chi.df = tmp$`Chi Df`[2],
p.value = tmp$`Pr(>Chisq)`[2]))
#
# m5 vs. m6
tmp <- anova(m5, m6)
class(tmp) <- "data.frame"
comp <- rbind(comp, data.frame(Comparison = "m5 vs m6", AIC.x = tmp$AIC[1],
AIC.y = tmp$AIC[2], relLik = exp((abs(tmp$AIC[1]-tmp$AIC[2]))/2),
Chisq = tmp$Chisq[2], Chi.df = tmp$`Chi Df`[2],
p.value = tmp$`Pr(>Chisq)`[2]))
rownames(comp) <- comp$Comparison
comp <- comp[, 2:7]
#
pval<-comp[,6]
tmp<-vector("character")
for(i in 1:length(pval)){
if(pval[i]<0.001){
tmp<-c(tmp,"< 0.001")
}else{
tmp<-c(tmp,as.character(round(pval[i],3)))
}
}
comp[,6]<-tmp
#
relLik<-comp[,3]
tmp<-vector("character")
for(i in 1:length(relLik)){
if(relLik[i]>1000){
tmp<-c(tmp,"> 1000")
}else{
tmp<-c(tmp,as.character(round(relLik[i],1)))
}
}
(comp[,3]<-tmp)
# AIC.x AIC.y relLik Chisq Chi.df p.value
# m1 vs m2 4728.256 4672.618 > 1000 57.637998 1 < 0.001
# m2 vs m3 4672.618 4680.590 53.8 4.028516 6 0.673
# m2 vs m4 4672.618 4678.635 20.2 0.000000 3 1
# m2 vs m5 4672.618 4586.754 > 1000 91.864055 3 < 0.001
# m5 vs m6 4586.754 4581.272 15.5 11.482516 3 0.009
#
### best model is m6
#
save(comp, file = "smry/behavioralComparisons.rda", compress = "xz")
#
# is the three-way interaction significant?
m7<-glmer(Recalled ~ (rcs(Position, 3) + rcs(LengthBc, 3) + rcs(WMCc, 4))^2 +
(1 | Subject) + (1 | Item) + (0 + rcs(Position, 3) | Subject) +
(0 + rcs(Position, 3) | Item), data = vars, family = "binomial")
#
# is there a Length X WMC interaction?
m8<-glmer(Recalled ~ rcs(Position, 3) * rcs(LengthBc, 3) + rcs(Position, 3) *
rcs(WMCc, 4) + (1 | Subject) + (1 | Item) + (0 + rcs(Position, 3) | Subject)
+ (0 + rcs(Position, 3) | Item), data = vars, family = "binomial")
#
# is there a Position X WMC interaction?
m9<-glmer(Recalled ~ rcs(Position, 3) * rcs(LengthBc, 3) + rcs(LengthBc, 3) *
rcs(WMCc, 4) + (1 | Subject) + (1 | Item) + (0 + rcs(Position, 3) | Subject)
+ (0 + rcs(Position, 3) | Item), data = vars, family = "binomial")
#
# is there a Position X Length interaction?
m10<-glmer(Recalled ~ rcs(Position, 3) * rcs(WMCc, 4) + rcs(LengthBc, 3) *
rcs(WMCc, 4) + (1 | Subject) + (1 | Item) + (0 + rcs(Position, 3) | Subject)
+ (0 + rcs(Position, 3) | Item), data = vars, family = "binomial")
#
save(m7, file = "models/m7FzRCSbehavioral.rda", compress = "xz")
save(m8, file = "models/m8FzRCSbehavioral.rda", compress = "xz")
save(m9, file = "models/m9FzRCSbehavioral.rda", compress = "xz")
save(m10, file = "models/m10FzRCSbehavioral.rda", compress = "xz")
# m7 vs. m6 -- is the three-way interaction significant?
tmp <- relLik(m7, m6)
#
# m8 vs. m7 -- is there a Length X WMC interaction?
tmp <- rbind(tmp,relLik(m7, m8))
tmp <- as.data.frame(tmp)
rownames(tmp)<-c("m7 vs. m6","m7 vs. m8")
#
# m9 vs. m7 -- is there a Position X WMC interaction?
tmp <- rbind(tmp,relLik(m7, m9))
tmp <- as.data.frame(tmp)
rownames(tmp)<-c("m7 vs. m6","m7 vs. m8","m7 vs. m9")
#
# m10 vs. m7 -- is there a Position X Length interaction?
tmp <- rbind(tmp,relLik(m7, m10))
tmp <- as.data.frame(tmp)
rownames(tmp)<-c("m7 vs. m6","m7 vs. m8","m7 vs. m9","m7 vs. m10")
#
(comp2<-round(tmp,0))
# AIC(x) AIC(y) diff relLik
# m7 vs. m6 4586 4581 5 11
# m7 vs. m8 4586 4574 12 486
# m7 vs. m9 4586 4580 6 19
# m7 vs. m10 4586 4586 1 1
#
### there is no three-way interaction; there is a
### Length X WMC interaction; there is a Position X WMC
### interaction; there is no Position X Length interaction
#
save(comp2, file = "smry/behavioralComparisons2.rda", compress = "xz")
#
# is there a Position X WMC interaction?
m11<-glmer(Recalled ~ rcs(Position, 3) * rcs(LengthBc, 3) + rcs(WMCc, 4) +
(1 | Subject) + (1 | Item) + (0 + rcs(Position, 3) | Subject)
+ (0 + rcs(Position, 3) | Item), data = vars, family = "binomial")
#
# is there a Position X LengthB interaction?
m12<-glmer(Recalled ~ rcs(Position, 3) + rcs(LengthBc, 3) + rcs(WMCc, 4) +
(1 | Subject) + (1 | Item) + (0 + rcs(Position, 3) | Subject)
+ (0 + rcs(Position, 3) | Item), data = vars, family = "binomial")
#
# is there a Position main effect?
m13<-glmer(Recalled ~ rcs(LengthBc, 3) + rcs(WMCc, 4) +
(1 | Subject) + (1 | Item), data = vars, family = "binomial")
#
# is there a LengthB main effect?
m14<-glmer(Recalled ~ rcs(Position, 3) + rcs(WMCc, 4) +
(1 | Subject) + (1 | Item) + (0 + rcs(Position, 3) | Subject)
+ (0 + rcs(Position, 3) | Item), data = vars, family = "binomial")
#
# is there a WMC main effect?
m15<-glmer(Recalled ~ rcs(Position, 3) + rcs(LengthBc, 3) +
(1 | Subject) + (1 | Item) + (0 + rcs(Position, 3) | Subject)
+ (0 + rcs(Position, 3) | Item), data = vars, family = "binomial")
#
tmp <- relLik(m8, m11)
#
tmp <- rbind(tmp,relLik(m11, m12))
tmp <- as.data.frame(tmp)
rownames(tmp)<-c("m8 vs. m11","m11 vs. m12")
#
tmp <- rbind(tmp,relLik(m12, m13))
tmp <- as.data.frame(tmp)
rownames(tmp)<-c("m8 vs. m11","m11 vs. m12","m12 vs. m13")
#
tmp <- rbind(tmp,relLik(m12, m14))
tmp <- as.data.frame(tmp)
rownames(tmp)<-c("m8 vs. m11","m11 vs. m12","m12 vs. m13",
"m12 vs. m14")
#
tmp <- rbind(tmp,relLik(m12, m15))
tmp <- as.data.frame(tmp)
rownames(tmp)<-c("m8 vs. m11","m11 vs. m12","m12 vs. m13",
"m12 vs. m14","m12 vs. m15")
#
(comp3<-round(tmp,0))
# AIC(x) AIC(y) diff relLik
# m8 vs. m11 4574 4567 7 3.200000e+01
# m11 vs. m12 4567 4565 2 3.000000e+00
# m12 vs. m13 4565 5061 -496 5.410225e+107
# m12 vs. m14 4565 4587 -22 6.557600e+04
# m12 vs. m15 4565 4572 -7 3.800000e+01
#
save(comp3, file = "smry/behavioralComparisons3.rda", compress = "xz")
#
# there are no interactions. all main effects are significant.
#
# is there an (0+rcs(LengthBc,3)|Subject) random effect?
m16<-glmer(Recalled ~ rcs(Position, 3) + rcs(LengthBc, 3) + rcs(WMCc, 4) +
(1 | Subject) + (1 | Item) + (0 + rcs(Position, 3) | Subject) +
(0 + rcs(LengthBc, 3) | Subject) + (0 + rcs(Position, 3) | Item),
data = vars, family = "binomial")
#
# is there an (0+rcs(WMCc,4)|Item) random effect?
m17<-glmer(Recalled ~ rcs(Position, 3) + rcs(LengthBc, 3) + rcs(WMCc, 4) +
(1 | Subject) + (1 | Item) + (0 + rcs(Position, 3) | Subject) +
(0 + rcs(WMCc, 4) | Item) + (0 + rcs(Position, 3) | Item),
data = vars, family = "binomial")
#
# is there an (0+rcs(Position,3)|Subject) random effect?
m18<-glmer(Recalled ~ rcs(Position, 3) + rcs(LengthBc, 3) + rcs(WMCc, 4) +
(1 | Subject) + (1 | Item) + (0 + rcs(Position, 3) | Item), data = vars,
family = "binomial")
#
# is there an (0+rcs(Position,3)|Item) random effect?
m19<-glmer(Recalled ~ rcs(Position, 3) + rcs(LengthBc, 3) + rcs(WMCc, 4) +
(1 | Subject) + (1 | Item) + (0 + rcs(Position, 3) | Subject),
data = vars, family = "binomial")
#
tmp <- relLik(m12, m16)
#
tmp <- rbind(tmp,relLik(m12, m17))
tmp <- as.data.frame(tmp)
rownames(tmp)<-c("m12 vs. m16","m12 vs. m17")
#
tmp <- rbind(tmp,relLik(m12, m18))
tmp <- as.data.frame(tmp)
rownames(tmp)<-c("m12 vs. m16","m12 vs. m17","m12 vs. m18")
#
tmp <- rbind(tmp,relLik(m12, m19))
tmp <- as.data.frame(tmp)
rownames(tmp)<-c("m12 vs. m16","m12 vs. m17","m12 vs. m18",
"m12 vs. m19")
#
(comp4<-round(tmp,0))
# AIC(x) AIC(y) diff relLik
# m12 vs. m16 4565 4582 -17 5.80000e+03
# m12 vs. m17 4565 4585 -20 2.26860e+04
# m12 vs. m18 4565 4675 -110 8.75686e+23
# m12 vs. m19 4565 4570 -5 1.50000e+01
#
save(comp4, file = "smry/behavioralComparisons4.rda", compress = "xz")
#
### Best model is model m12
# p values for main effects
anova(m12,m13)
# Df AIC BIC logLik Chisq Chi Df Pr(>Chisq)
# m13 8 5060.9 5111.8 -2522.4
# m12 16 4564.8 4666.5 -2266.4 512.13 8 < 2.2e-16 ***
#
anova(m12,m14)
# Df AIC BIC logLik Chisq Chi Df Pr(>Chisq)
# m14 14 4587.0 4676.0 -2279.5
# m12 16 4564.8 4666.5 -2266.4 26.182 2 2.064e-06 ***
#
anova(m12,m15)
# m15 13 4572.0 4654.7 -2273.0
# m12 16 4564.8 4666.5 -2266.4 13.27 3 0.004087 **
#
# save models
save(m11, file = "models/m11FzRCSbehavioral.rda", compress = "xz")
save(m12, file = "models/m12FzRCSbehavioral.rda", compress = "xz")
save(m13, file = "models/m13FzRCSbehavioral.rda", compress = "xz")
save(m14, file = "models/m14FzRCSbehavioral.rda", compress = "xz")
save(m15, file = "models/m15FzRCSbehavioral.rda", compress = "xz")
save(m16, file = "models/m16FzRCSbehavioral.rda", compress = "xz")
save(m17, file = "models/m17FzRCSbehavioral.rda", compress = "xz")
save(m18, file = "models/m18FzRCSbehavioral.rda", compress = "xz")
save(m19, file = "models/m19FzRCSbehavioral.rda", compress = "xz")
#
### plot results
library(languageR)
ylimit<-c(0,0.6)
pdf(file="figs/model_prob_behavioural.pdf")
par(mfrow=c(2,2))
plotLMER.fnc(m12,pred="LengthBc",ylim=ylimit,
ylab="Probability of Correct Recall",
xlabel="LengthB",xaxt="n",main="(A)")
# log odds are back-transformed to probabilities
# effect size (range) for LengthBc is 0.1005774
len1<-sort(unique(vars$LengthBc))
len1<-seq(min(len1),max(len1),length.out=6)
len2<-sort(unique(vars$LengthB))
len2<-round(seq(min(len2),max(len2),length.out=6))
axis(side=1,at=len1,labels=len2,cex.axis=1)
plotLMER.fnc(m12,pred="Position",ylim=ylimit,
ylab="Probability of Correct Recall",main="(B)")
# log odds are back-transformed to probabilities
# effect size (range) for Position is 0.4326353
plotLMER.fnc(m12,pred="WMCc",ylim=ylimit,
ylab="Probability of Correct Recall",xlabel="WMC",
xaxt="n",main="(C)")
# log odds are back-transformed to probabilities
# effect size (range) for WMCc is 0.2065505
wmc1<-sort(unique(vars$WMCc))
wmc1<-seq(min(wmc1),max(wmc1),length.out=6)
wmc2<-sort(unique(vars$WMC))
wmc2<-round(seq(min(wmc2),max(wmc2),length.out=6),2)
axis(side=1,at=wmc1,labels=wmc2,cex.axis=1)
par(mfrow=c(1,1))
dev.off()
@
}
}
We investigated whether Length, Position, WMC, and their interactions affected the probability of correctly recalling a four-word sequence. In order to be correctly recalled, a four-word sequence had to be recalled exactly. That is, if the target sequence was {\it in the middle of}, any response other than {\it in the middle of} was considered to be incorrect (e.g., {\it in the middle}, {\it in the middle and}, {\it in the middle of a}, or {\it at the middle of}). However, we accepted minor misspellings such as {\it in the mdle of} or {\it n the midle of}.
The optimal generalized LME model (for which linearity was \emph{not} assumed) included main effects of Length ($\chi^2_{(2)} = 26.2$, $p < 0.001$), Position ($\chi^2_{(8)} = 512.1$, $p < 0.001$), and WMC ($\chi^2_{(3)} = 13.3$, $p = 0.004$).\footnote{These statistics were obtained by way of log-likelihood ratio tests between models with and without the main effects. Moreover, models with the main effects always had a lower AIC value than models without them.} Results of the behavioural analysis are depicted in Figure \ref{fig:behaviouralResults}. Length had a small, albeit significant, effect on the probability of correctly recalling a four-word sequence. As illustrated in panel A of Figure \ref{fig:behaviouralResults}, the probability of successful recall was roughly the same when the length of the second word was 1, 2, 3, or 4 letters, but decreased as length increased beyond that. The Position effect shown in panel B exhibited a large recency effect whereby more recently presented four-word sequences were more likely to be correctly recalled, possibly reflecting an advantage in activation strength for the sequences presented more recently within a block \cite{Jones2013}. Finally, as can be seen in panel C, the greater the working memory capacity of a participant, the more likely she was to correctly recall a four-word sequence.
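For reference, the relative likelihoods reported in the model comparisons above are the quantity computed explicitly from the AIC values in the accompanying code: for two models with values $\mathrm{AIC}_1$ and $\mathrm{AIC}_2$,
\begin{equation*}
\mathrm{relLik} = \exp\!\left(\frac{\lvert \mathrm{AIC}_1 - \mathrm{AIC}_2 \rvert}{2}\right),
\end{equation*}
which can be read as an evidence ratio indicating how much more probable the model with the smaller AIC is, given the data, than the competing model.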
\begin{figure}[h]
\centering
\includegraphics{figs/model_prob_behavioural.pdf}
\caption{Results of the behavioural analysis. (A): Partial effect of Length on the probability of correctly recalling a four-word sequence. (B): Partial effect of Position. (C): Partial effect of Working Memory Capacity (WMC).}
\label{fig:behaviouralResults}
\end{figure}
}
{\subsection{ERP Results}
An \texttt{mp4} movie of the scalp topographies through time can be downloaded from \texttt{http://hdl.handle.net/10222/22146}. Figure \ref{fig:topomap141ms} depicts the scalp topography at time $t = 141$ ms, where the N1a was maximal. It is apparent from the scalp topography movie, as well as from the bottom panel of Figure \ref{fig:topomap141ms}, that a negative deflection around electrode Fz --- the N1a --- began at $t \thicksim 78$ ms, peaked at $t = 141$ ms ($\thicksim -5~\mu V$), and returned to baseline at $t \thicksim 180$ ms. Our window for analysis was thus chosen as 80--180 ms.
{% R code: install and load libraries, and format data
{\footnotesize
% eval=FALSE
<>=
options(width=60)
options(continue=" ")
@
<>=
### IF YOU ARE BUILDING THIS DOCUMENT FOR THE FIRST TIME,
### SET ALL THE CHUNK EVAL OPTIONS TO TRUE. THIS WILL TAKE
### QUITE A BIT OF TIME. AFTERWARDS, SET THE EVAL OPTIONS
### MARKED FALSE BACK TO FALSE TO SAVE TIME WHEN RE-BUILDING.
if(!require(LMERConvenienceFunctions, quietly = TRUE)){
install.packages("LMERConvenienceFunctions", dependencies = TRUE,
repos = "http://mirror.its.dal.ca/cran/")
}
library(LMERConvenienceFunctions)
#
# Only available from dalspace at the moment.
if(!require(EEGLMERdata, quietly = TRUE)){
tmp.dir <- tempdir()
download.file(url = "http://dalspace.library.dal.ca/bitstream/handle/10222/22146/EEGLMERdata_0.4.tar.gz?sequence=1", destfile = file.path(tmp.dir,
"EEGLMERdata_0.4.tar.gz"), method = "auto")
install.packages(file.path(tmp.dir, "EEGLMERdata_0.4.tar.gz"),
repos = NULL)
file.remove(file.path(tmp.dir, "EEGLMERdata_0.4.tar.gz"))
}
library(EEGLMERdata)
#
if(!require(xtable, quietly = TRUE)){
install.packages("xtable", dependencies = TRUE,
repos = "http://mirror.its.dal.ca/cran/")
}
library(xtable)
#
if(!require(fields, quietly = TRUE)){
install.packages("fields", dependencies = TRUE,
repos = "http://mirror.its.dal.ca/cran/")
}
library(fields)
#
dir.create("data")
dir.create("models")
dir.create("tables")
dir.create("smry")
dir.create("figs/topomaps", recursive = TRUE)
@
% eval=FALSE
<>=
### will need about 6--8 GB of RAM to do this portion
# load data set
data(eegWide)
dat<-eegWide
rm(eegWide)
gc(TRUE,TRUE)
data(vars)
dat <- merge(dat[,c("Subject", "Item", "Stimulus", "Time", "Trial", "Fp1",
"Fp2", "AF3", "AF4", "F7", "F3", "Fz", "F4", "F8", "FC5", "FC1", "FC2",
"FC6", "T7", "C3", "Cz", "C4", "T8", "CP5", "CP1", "CP2", "CP6", "P7", "P3",
"Pz", "P4", "P8", "PO3", "PO4", "O1", "Oz", "O2")],vars[,c("Item", "Subject",
"WMC", "Position", "LengthB", "Recalled")], by = c("Item", "Subject"))
#
# put in long format
electrodes <- c("Fp1", "Fp2", "AF3", "AF4", "F7", "F3", "Fz", "F4", "F8",
"FC5", "FC1", "FC2", "FC6", "T7", "C3", "Cz", "C4", "T8", "CP5", "CP1",
"CP2", "CP6", "P7", "P3", "Pz", "P4", "P8", "PO3", "PO4", "O1", "Oz", "O2")
temp <- dat[, c("Subject", "Item", "Stimulus", "Time", "Trial",
"Position", "WMC", "LengthB", "Recalled")]
for(e in 1:length(electrodes)){
cat(e, "\n")
if(e == 1){
temp2 <- cbind(temp, dat[, electrodes[e]])
colnames(temp2)[ncol(temp2)] <- "Amplitude"
temp2$Electrode <- electrodes[e]
}else{
temp3 <- cbind(temp, dat[, electrodes[e]])
colnames(temp3)[ncol(temp3)] <- "Amplitude"
temp3$Electrode <- electrodes[e]
temp2 <- rbind(temp2, temp3)
rm(temp3); gc(TRUE, TRUE)
}
}
dat <- temp2
dat$Electrode <- as.factor(dat$Electrode)
dat <- dat[order(dat$Subject, dat$Electrode, dat$Position, dat$Time), ]
subj<-sort(unique(dat$Subject))
for(i in subj){
cat("processing subject", i, "\n")
tmp <- dat[dat$Subject == i, , drop=TRUE]
tmp$newfact <- paste(tmp$Block, tmp$Position, sep = "_")
newvec <- vector("numeric")
for(j in 1:length(unique(tmp$newfact))){
cat("\tnewfact",j,"\n")
newvec <- c(newvec, rep(j, nrow(tmp[tmp$newfact ==
unique(tmp$newfact)[j], ])))
}
tmp$Trial <- newvec
if(grep(i, subj)[1] == 1){
newdat <- tmp
}else{
newdat <- rbind(newdat, tmp)
}
}
dat<-newdat
# per subject trimming
dat <- perSubjectTrim.fnc(dat, response = "Amplitude",
subject = "Subject", trim = 2.5)$data
# n.removed = 186565
# percent.removed = 1.800765
#
dat <- dat[ , c("Subject", "Item", "Time", "Trial", "Position", "WMC",
"LengthB", "Electrode", "Amplitude", "Recalled")]
# save trimmed data
save(dat, file = "data/eegLong.rda", compress = "xz")
@
}
}
{% R code: create topomaps
{\footnotesize
% eval=FALSE
<>=
# produce plot
library(EEGLMERdata)
data(eegLong)
dat<-eegLong
rm(eegLong)
gc(TRUE,TRUE)
# create average
electrodes <- unique(dat$Electrode)
time <- sort(unique(dat$Time))
for(i in 1:length(electrodes)){
cat("processing electrode", electrodes[i], "\n")
tmp <- dat[dat$Electrode == electrodes[i], ]
x <- tapply(tmp$Amplitude, tmp$Time, mean)
if(i == 1){
avg <- data.frame(Time = time, Amplitude = x, Electrode = electrodes[i])
}else{
avg <- rbind(avg, data.frame(Time = time, Amplitude = x,
Electrode = electrodes[i]))
}
}
rownames(avg) <- 1:nrow(avg)
#
# define cartesian coordinate for Biosemi 32 electrodes
coords <- data.frame(x = c(-0.496189,-0.545830,-1.299041,-0.659023,-0.394923,
-1.173172,-1.605703,-0.802851,-0.394923,-1.173172,-1.299041,-0.659023,
0.000000,-0.545830,-0.496189,0.000000,0.496189,0.545830,0.659023,1.299041,
1.173172,0.394923,0.802851,1.605703,1.173172,0.394923,0.659023,1.299041,
0.545830,0.496189,0.000000,0.000000),y=c(1.527114,1.170536,0.943808,
0.813825,0.394923,0.450338,-0.000000,-0.000000,-0.394923,-0.450338,
-0.943808,-0.813825,-0.802851,-1.170536,-1.527114,-1.605703,-1.527114,
-1.170536,-0.813825,-0.943808,-0.450338,-0.394923,0.000000,0.000000,
0.450338,0.394923,0.813825,0.943808,1.170536,1.527114,0.802851,0.000000),
Channel = c("Fp1","AF3","F7", "F3","FC1","FC5", "T7","C3","CP1","CP5",
"P7","P3","Pz","PO3","O1","Oz","O2", "PO4","P4","P8", "CP6","CP2","C4",
"T8","FC6","FC2","F4","F8","AF4","Fp2", "Fz","Cz"))
#
# merge
avg <- merge(avg, coords, by.x = "Electrode", by.y = "Channel")
rownames(avg) <- 1:nrow(avg)
save(avg, file = "data/avg.rda", compress = "xz")
#
# model the topomaps
library(mgcv)
m <- gam(Amplitude ~ te(Time, x, y, bs = "cs", k = c(40, 5, 5)), data = avg)
# save model
save(m, file = "models/topomap_with_baseline_model.rda", compress = "xz")
#
#
### create mask
w<-matrix(1,ncol=400,nrow=400)
cnt<-200
cnt2<-20
cnt3<-390
for(ii in 1:6){
for(jj in 1:8){
cnt<-cnt+1
if(jj==1){
cnt2<-cnt2+1
}
if(jj==1){
cnt3<-cnt3-1
}
w[cnt,cnt2:cnt3]<-NA
}
}
for(ii in 1:26){
for(jj in 1:2){
cnt<-cnt+1
if(jj==1){
cnt2<-cnt2+1
}
if(jj==1){
cnt3<-cnt3-1
}
w[cnt,cnt2:cnt3]<-NA
}
}
for(ii in 1:18){
for(jj in 1:2){
cnt<-cnt+1
if(jj==1){
cnt2<-cnt2+2
}
if(jj==1){
cnt3<-cnt3-2
}
w[cnt,cnt2:cnt3]<-NA
}
}
for(ii in 1:7){
for(jj in 1:2){
cnt<-cnt+1
if(jj==1){
cnt2<-cnt2+3
}
if(jj==1){
cnt3<-cnt3-3
}
w[cnt,cnt2:cnt3]<-NA
}
}
for(ii in 1:3){
for(jj in 1:2){
cnt<-cnt+1
if(jj==1){
cnt2<-cnt2+5
}
if(jj==1){
cnt3<-cnt3-5
}
w[cnt,cnt2:cnt3]<-NA
}
}
for(ii in 1:8){
for(jj in 1:2){
cnt<-cnt+1
if(jj==1){
cnt2<-cnt2+6
}
if(jj==1){
cnt3<-cnt3-6
}
w[cnt,cnt2:cnt3]<-NA
}
}
cnt<-201
cnt2<-20
cnt3<-390
for(ii in 1:6){
for(jj in 1:8){
cnt<-cnt-1
if(jj==1){
cnt2<-cnt2+1
}
if(jj==1){
cnt3<-cnt3-1
}
w[cnt,cnt2:cnt3]<-NA
}
}
for(ii in 1:26){
for(jj in 1:2){
cnt<-cnt-1
if(jj==1){
cnt2<-cnt2+1
}
if(jj==1){
cnt3<-cnt3-1
}
w[cnt,cnt2:cnt3]<-NA
}
}
for(ii in 1:18){
for(jj in 1:2){
cnt<-cnt-1
if(jj==1){
cnt2<-cnt2+2
}
if(jj==1){
cnt3<-cnt3-2
}
w[cnt,cnt2:cnt3]<-NA
}
}
for(ii in 1:7){
for(jj in 1:2){
cnt<-cnt-1
if(jj==1){
cnt2<-cnt2+3
}
if(jj==1){
cnt3<-cnt3-3
}
w[cnt,cnt2:cnt3]<-NA
}
}
for(ii in 1:3){
for(jj in 1:2){
cnt<-cnt-1
if(jj==1){
cnt2<-cnt2+5
}
if(jj==1){
cnt3<-cnt3-5
}
w[cnt,cnt2:cnt3]<-NA
}
}
for(ii in 1:8){
for(jj in 1:2){
cnt<-cnt-1
if(jj==1){
cnt2<-cnt2+6
}
if(jj==1){
cnt3<-cnt3-6
}
w[cnt,cnt2:cnt3]<-NA
}
}
write.table(w,file="biosemi.32.mask.txt",quote=F,sep="\t",row.names=F,col.names=F)
#
#
### create frames for movie
load("models/topomap_with_baseline_model.rda")
load("data/avg.rda")
dat <- m$model
time <- sort(unique(dat$Time))
zlimit = c(-5.5, 5.5)
ylimit = c(-4.5, 4.5)
coords <- data.frame(x = c(-0.496189,-0.545830,-1.299041,-0.659023,-0.394923,
-1.173172,-1.605703,-0.802851,-0.394923,-1.173172,-1.299041,-0.659023,
0.000000,-0.545830,-0.496189,0.000000,0.496189,0.545830,0.659023,1.299041,
1.173172,0.394923,0.802851,1.605703,1.173172,0.394923,0.659023,1.299041,
0.545830,0.496189,0.000000,0.000000),y=c(1.527114,1.170536,0.943808,
0.813825,0.394923,0.450338,-0.000000,-0.000000,-0.394923,-0.450338,
-0.943808,-0.813825,-0.802851,-1.170536,-1.527114,-1.605703,-1.527114,
-1.170536,-0.813825,-0.943808,-0.450338,-0.394923,0.000000,0.000000,
0.450338,0.394923,0.813825,0.943808,1.170536,1.527114,0.802851,0.000000),
Channel = c("Fp1","AF3","F7", "F3","FC1","FC5", "T7","C3","CP1","CP5",
"P7","P3","Pz","PO3","O1","Oz","O2", "PO4","P4","P8", "CP6","CP2","C4",
"T8","FC6","FC2","F4","F8","AF4","Fp2", "Fz","Cz"))
electrodes <- as.character(coords$Channel)
np <- 400
#pal <- colorRampPalette(c("blue","white","red"))(75)
pal <- tim.colors(200)
x <- seq(min(coords$x)-1, max(coords$x)+1, length.out = np)
y <- seq(min(coords$y)-1, max(coords$y)+1, length.out = np)
library(fields)
for(i in 1:length(time)){
cat("processing Time",i,"\n")
png(filename = paste("figs/topomaps/Time",i,".png",sep=""),
height=960,width=960)
newdat <- expand.grid(x, y, Time = time[i])
colnames(newdat) <- c("x","y","Time")
X <- predict(m, newdat, type = "lpmatrix")
X <- X %*% coef(m)
# create average matrix
newdat <- data.frame(newdat[, c("x", "y")], Amplitude = X)
mat<-tapply(newdat$Amplitude, list(newdat$x, newdat$y), mean)
xm<-as.numeric(rownames(mat))
ym<-as.numeric(colnames(mat))
#
layout(matrix(c(1,1,1,1,1,1,2,2), 4, 2, byrow = TRUE))
par(mar = c(0.1, 5.1, 3.1, 17.1))
image(xm,ym,mat,axes=FALSE,xlab="",ylab="",col=pal,zlim=zlimit)
contour(xm,ym,mat,add=T,nlevels=25,labcex=0.01,col="black")
image(xm,ym,w,add=T,col="white")
# draw head
F1 <- c(0, 0.75)
F2 <- c(0, -0.7)
eRp:::ellipse(F1[1], F1[2], F2[1], F2[2],
k=3.75, col="black", lwd=4)
eRp:::ellipse(F1[1], F1[2], F2[1], F2[2],
k=4.75, col="white", lwd=15)
# left ear
F1 <- c(-1.5, 0.05)
F2 <- c(-2.1, 0.05)
eRp:::ellipse(F1[1], F1[2], F2[1], F2[2], k=0.1,
col="black", lwd=4)
# right ear
F1 <- c(1.5, 0.05)
F2 <- c(2.1, 0.05)
eRp:::ellipse(F1[1], F1[2], F2[1], F2[2], k=0.1,
col="black", lwd=4)
# draw nose
segments(-0.145, 1.9, 0, 2.15, lwd = 4)
segments(0, 2.15, 0.145, 1.9, lwd = 4)
#
cex.e<-1.75
for(e in 1:nrow(coords)){
text(coords[e,"x"], coords[e, "y"], coords[e,
"Channel"], cex = cex.e, col = "black")
}
image.plot(xm,ym,mat,axes=FALSE,xlab="",ylab="",col=pal,
zlim=zlimit,legend.args=list(eval(parse(text=paste("expression(",
paste("Amplitude~~(",expression(mu * V),")",sep=""),")",sep=""))),
cex=2.25,line=3.5,side=1,adj=0.9),legend.width=2,legend.shrink=0.5,
axis.args=list(cex.axis=1.75),legend.only=TRUE,graphics.reset=TRUE,
legend.mar=8.1)
close.screen(all.screens=TRUE)
par(new=TRUE)
layout(matrix(c(1,1,1,1,1,1,2,2), 4, 2, byrow = TRUE))
frame()
par(mar = c(3.1, 4.1, 0.1, 2.1))
#
# now plot subplot
for(e in electrodes){
tmp <- avg[avg$Electrode==e,]
m.tmp <- gam(Amplitude ~ s(Time, bs = "cs", k = 40),
data = tmp)
pi.tmp <- plotGAM(m.tmp, plot = FALSE)
if(which(electrodes == e) == 1){
plot(pi.tmp$x, pi.tmp$y, col = which(electrodes == e),
ylim = ylimit, type = "l", ylab = "", cex.axis = 2.25,
lwd = 2)
}else{
lines(pi.tmp$x, pi.tmp$y, col = which(electrodes == e),
lwd = 2)
}
}
abline(v=0, lty=3, lwd = 2)
abline(v = time[i], lwd = 4)
mtext(paste("Time =", round(time[i],0),"ms"),side=3,line=72,adj=0.5,cex=2)
dev.off()
}
# In a bash terminal (Linux), copy and paste the following command
# to produce an mp4 movie.
#
# avconv -y -r 15 -i Time%01d.png -c:v libx264 -preset slow -crf 21 ../Topomaps.mp4
#
# -y = overwrite output files without asking
# -r 15 = 15 frames per second
# -i Time%01d.png = input files (Time1.png up to the highest
#   number in the file names)
# -c:v libx264 = use libx264 for encoding
# -preset slow = slower encoding for better quality
# -crf 21 = quality, from 0 (best) to 51 (worst)
#
#
# you can also do it directly in R with the following commands:
#
setwd("figs/topomaps")
system("avconv -y -r 10 -i Time%01d.png -c:v libx264 -preset slow -crf 0 ../Topomaps.mp4")
setwd("../..")
#
#
### create PNG for time = 141 ms, i.e., figure 18.
i<-32
np <- 400
pal <- tim.colors(200)
x <- seq(min(coords$x)-1, max(coords$x)+1, length.out = np)
y <- seq(min(coords$y)-1, max(coords$y)+1, length.out = np)
time <- sort(unique(dat$Time))
zlimit = c(-5.5, 5.5)
ylimit = c(-4.5, 4.5)
library(fields)
#
newdat <- expand.grid(x, y, Time = time[i])
colnames(newdat) <- c("x","y","Time")
X <- predict(m, newdat, type = "lpmatrix")
X <- X %*% coef(m)
# create average matrix
newdat <- data.frame(newdat[, c("x", "y")], Amplitude = X)
mat<-tapply(newdat$Amplitude, list(newdat$x, newdat$y), mean)
xm<-as.numeric(rownames(mat))
ym<-as.numeric(colnames(mat))
save(mat,file="data/plotmat141ms.rda")
#
#
load("data/plotmat141ms.rda")
xm<-as.numeric(rownames(mat))
ym<-as.numeric(colnames(mat))
#
png(filename = "figs/Time141ms.png", height=960, width=960)
layout(matrix(c(1,1,1,1,1,1,2,2), 4, 2, byrow = TRUE))
par(mar = c(0.1, 5.1, 3.1, 17.1))
image(xm,ym,mat,axes=FALSE,xlab="",ylab="",col=pal,zlim=zlimit)
contour(xm,ym,mat,add=T,nlevels=25,labcex=0.01,col="black")
image(xm,ym,w,add=T,col="white")
# draw head
F1 <- c(0, 0.75)
F2 <- c(0, -0.7)
eRp:::ellipse(F1[1], F1[2], F2[1], F2[2],
k=3.75, col="black", lwd=4)
eRp:::ellipse(F1[1], F1[2], F2[1], F2[2],
k=4.75, col="white", lwd=15)
# left ear
F1 <- c(-1.5, 0.05)
F2 <- c(-2.1, 0.05)
eRp:::ellipse(F1[1], F1[2], F2[1], F2[2], k=0.1,
col="black", lwd=4)
# right ear
F1 <- c(1.5, 0.05)
F2 <- c(2.1, 0.05)
eRp:::ellipse(F1[1], F1[2], F2[1], F2[2], k=0.1,
col="black", lwd=4)
# draw nose
segments(-0.145, 1.9, 0, 2.15, lwd = 4)
segments(0, 2.15, 0.145, 1.9, lwd = 4)
#
cex.e<-1.75
for(e in 1:nrow(coords)){
text(coords[e,"x"], coords[e, "y"], coords[e,
"Channel"], cex = cex.e, col = "black")
}
image.plot(xm,ym,mat,axes=FALSE,xlab="",ylab="",col=pal,zlim=zlimit,
legend.args=list(eval(parse(text=paste("expression(",
paste("Amplitude~~(",expression(mu * V),")",sep=""),")",sep=""))),
cex=2.25,line=3.5,side=1,adj=0.9),legend.width=2,legend.shrink=0.5,
axis.args=list(cex.axis=1.75),legend.only=TRUE,graphics.reset=TRUE,
legend.mar=8.1)
close.screen(all.screens=TRUE)
par(new=TRUE)
layout(matrix(c(1,1,1,1,1,1,2,2), 4, 2, byrow = TRUE))
frame()
par(mar = c(3.1, 4.1, 0.1, 2.1))
#
# now plot subplot
for(e in electrodes){
tmp <- avg[avg$Electrode==e,]
m.tmp <- gam(Amplitude ~ s(Time, bs = "cs", k = 40),
data = tmp)
pi.tmp <- plotGAM(m.tmp, plot = FALSE)
if(which(electrodes == e) == 1){
plot(pi.tmp$x, pi.tmp$y, col = which(electrodes == e),
ylim = ylimit, type = "l", ylab = "", cex.axis = 2.25,
lwd = 2)
}else{
lines(pi.tmp$x, pi.tmp$y, col = which(electrodes == e),
lwd = 2)
}
}
abline(v=0, lty=3, lwd = 2)
abline(v = time[i], lwd = 4)
dev.off()
@
}
}
\begin{figure}[tbh]
\centering
\includegraphics{figs/Time141ms.png}
\caption{Scalp topography at time $t = 141$ ms where the N1a was maximal. Blue colors indicate negative amplitudes while red colors indicate positive amplitudes. White indicates an amplitude of 0 $\mu V$. The bottom panel shows the a\-ve\-rage waveform of each of the 32 electrodes overlaid on top of each other (each electrode is graphed with a different color). The {\it x}-axis represents time in milliseconds and the {\it y}-axis is amplitude in $\mu V$. A vertical black bar marks the peak of the N1a component.}
\label{fig:topomap141ms}
\end{figure}
{% R code: plotMGCVsmoothTweaked
{\footnotesize
% eval=FALSE
<>=
# code borrowed from
# https://svn.r-project.org/R-packages/trunk/mgcv/R/plots.r
plotMGCVsmoothTweaked<<-function(x,select=1,se=TRUE,se1.mult=2,
seWithMean=FALSE,partial.resids=FALSE,n=100){
i<-select
first <- x$smooth[[i]]$first.para
last <- x$smooth[[i]]$last.para
edf <- sum(x$edf[first:last]) ## Effective DoF for this term
P<-mgcv:::plot.mgcv.smooth(x$smooth[[i]],P=NULL,
data=x$model,se=se,se1.mult=se1.mult,
seWithMean=seWithMean,n=n)
p <- x$coefficients[first:last] ## relevant coefficients
offset <- attr(P$X,"offset") ## any term specific offset
## get fitted values ....
if (is.null(offset)) P$fit <- P$X%*%p else P$fit <- P$X%*%p + offset
if (!is.null(P$exclude)) P$fit[P$exclude] <- NA
if (se && P$se) { ## get standard errors for fit
## test whether mean variability to be added to variability
## (only for centred terms)
if (seWithMean && attr(x$smooth[[i]],"nCons")>0) {
X1 <- matrix(x$cmX,nrow(P$X),ncol(x$Vp),byrow=TRUE)
meanL1 <- x$smooth[[i]]$meanL1
if (!is.null(meanL1)) X1 <- X1 / meanL1
X1[,first:last] <- P$X
se.fit <- sqrt(rowSums((X1%*%x$Vp)*X1))
} else se.fit <- ## se in centred (or anyway unconstrained) space only
sqrt(rowSums((P$X%*%x$Vp[first:last,first:last])*P$X))
if (!is.null(P$exclude)) P$se.fit[P$exclude] <- NA
} ## standard errors for fit completed
## NB: fv.terms, order, and w.resid are defined only inside the full
## plot.gam() context; partial residuals are not used here
if (partial.resids) { P$p.resid <- fv.terms[,length(order)+i] + w.resid }
if (se && P$se) P$se <- se.fit*P$se.mult # Note multiplier
ul <- P$fit + P$se ## upper CL
ll <- P$fit - P$se ## lower CL
ylimit<-c(min(ll),max(ul))
return(invisible(list(x=P$x,y=P$fit,y1a=ll,y1b=ul,ylim=ylimit,
xlab=P$xlab,xlim=P$xlim,ylab=P$ylab,main=P$main,se.fit=P$se)))
}
@
}
}
{% R code: barplots and waveforms
{\footnotesize
% eval=FALSE
<>=
data(eegLong)
dat<-eegLong
rm(eegLong)
gc(TRUE,TRUE)
# mean center LengthB and dichotomize
dat$LengthBc <- dat$LengthB-mean(dat$LengthB)
dat$LengthBdc <- "short"
dat$LengthBdc[dat$LengthBc>median(dat$LengthBc,
na.rm = TRUE)] <- "long"
dat$LengthBdc <- as.factor(dat$LengthBdc)
#
# mean center WMC and dichotomize
dat$WMCc <- dat$WMC-mean(dat$WMC)
dat$WMCdc <- "low"
dat$WMCdc[dat$WMCc>median(dat$WMCc,
na.rm = TRUE)] <- "high"
dat$WMCdc <- as.factor(dat$WMCdc)
#
dat0<-dat
dat0<-dat0[dat0$Electrode=="Fz",]
rm(dat)
gc(TRUE,TRUE)
ylimit <- c(-7.5, 1)
ylimit2 <- c(-7,0)
### main effects
pdf(file="figs/FzMainEffects.pdf")
op <- par()$mar
par(mar = c(3.1, 4.1, 2.1, 3.1))
par(mfrow = c(3,2))
nk <- 25
tw <- c(80, 180)
e<-"Fz"
# Length
tmp<-dat0[dat0$Time<=200&dat0$Electrode==e,]
tmp$Position<-as.factor(tmp$Position)
tmp$newfac <- tmp$LengthBdc
avg <- tapply(tmp$Amplitude, list(tmp$newfac, tmp$Time), mean)
x <- cbind(avg[1, ], avg[2, ])
x <- cbind(x, as.numeric(rownames(x)))
x <- as.data.frame(x)
rownames(x) <- 1:nrow(x)
colnames(x) <- c(as.character(levels(tmp$newfac)), "Time")
xx <- list()
for(i in 1:nlevels(tmp$newfac)){
eval(parse(text = paste("xx[[i]] <-
plotMGCVsmoothTweaked(gam(", as.character(levels(
tmp$newfac))[i],"~s(Time, k = nk), data = x))",
sep = "")))
}
# waveforms
plot(xx[[1]]$x, xx[[1]]$y + mean(avg[1, ]),
type = "n", ylim = ylimit, xlab = "", ylab = "",
cex.axis=1.5,cex.lab=1.5,main="")
rect(tw[1], ylimit[1], tw[2], ylimit[2], density = NA,
col = grey(0.85))
par(new = TRUE)
plot(xx[[1]]$x, xx[[1]]$y + mean(avg[1, ]),
type = "l", ylim = ylimit, xlab = "",
ylab = "",cex.lab=1.5, axes=FALSE,main="")
mtext("Time",side=1,line=-1.5,cex=1.1)
mtext(eval(parse(text=paste("expression(",
paste("Amplitude~~(",expression(mu * V), ")",
sep=""),")",sep=""))),side=2,cex=1.25,line=2.5)
mtext("Length", side=3,line = 0.5,cex=1.25)
for(i in 2:nlevels(tmp$newfac)){
lines(xx[[i]]$x, xx[[i]]$y + mean(avg[i, ]), col = i)
}
abline(v=0,lty=3,col="grey")
# barplot
gr.avg <- tapply(dat0[dat0$Time>=tw[1]&dat0$Time<=tw[2],]$Amplitude,
dat0[dat0$Time>=tw[1]&dat0$Time<=tw[2],]$LengthBdc,mean)
barplot(gr.avg, col = 1:nlevels(dat0$LengthBdc),
ylab = "", ylim = ylimit2, legend.text = levels(dat0$LengthBdc),
args.legend = list(x="bottomleft", ncol = 4, cex = 1.75,
bty = "n", title = "",xjust = 1), names.arg = FALSE, cex.axis = 1.5)
# Position
#ylimit <- c(-6.5, 6.5)
#ylimit2 <- c(-3.5,0)
tmp<-dat0[dat0$Time<=200&dat0$Electrode==e,]
tmp$Position<-as.factor(tmp$Position)
tmp$newfac <- tmp$Position
avg <- tapply(tmp$Amplitude, list(tmp$newfac, tmp$Time), mean)
x <- cbind(avg[1, ], avg[2, ])
for(i in 3:nlevels(tmp$newfac)){
x <- cbind(x, avg[i, ])
}
x <- cbind(x, as.numeric(rownames(x)))
x <- as.data.frame(x)
rownames(x) <- 1:nrow(x)
colnames(x) <- c(paste("Pos",as.character(levels(tmp$newfac)),sep=""), "Time")
xx <- list()
for(i in 1:nlevels(tmp$newfac)){
eval(parse(text = paste("xx[[i]] <- plotMGCVsmoothTweaked(gam(",
paste("Pos", as.character(levels( tmp$newfac))[i],sep=""),
"~s(Time, k = nk), data = x))", sep = "")))
}
# waveform
plot(xx[[1]]$x, xx[[1]]$y + mean(avg[1, ]),
type = "n", ylim = ylimit, xlab = "", ylab = "",
cex.axis=1.5,cex.lab=1.5,main="")
rect(tw[1], ylimit[1], tw[2], ylimit[2], density = NA,
col = grey(0.85))
par(new = TRUE)
plot(xx[[1]]$x, xx[[1]]$y + mean(avg[1, ]),
type = "l", ylim = ylimit, xlab = "",
ylab = "",cex.lab=1.5, axes=FALSE,main="")
mtext("Position", side=3,line = 0.5,cex=1.25)
for(i in 2:nlevels(tmp$newfac)){
lines(xx[[i]]$x, xx[[i]]$y + mean(avg[i, ]), col = i)
}
abline(v=0,lty=3,col="grey")
# barplot
gr.avg <- tapply(dat0[dat0$Time>=tw[1]&dat0$Time<=tw[2],]$Amplitude,
as.factor(dat0[dat0$Time>=tw[1]&dat0$Time<=tw[2],]$Position),mean)
barplot(gr.avg, col = 1:nlevels(as.factor(dat0$Position)),
ylab = "", ylim = ylimit2, legend.text = levels(as.factor(dat0$Position)),
args.legend = list(x="bottomleft", ncol = 4, cex = 1.75,
bty = "n", title = "",xjust = 1), names.arg = FALSE, cex.axis = 1.5)
# WMC
#ylimit <- c(-5, 4.5)
#ylimit2 <- c(-2.75,0)
tmp<-dat0[dat0$Time<=200&dat0$Electrode==e,]
tmp$Position<-as.factor(tmp$Position)
tmp$newfac <- tmp$WMCdc
avg <- tapply(tmp$Amplitude, list(tmp$newfac, tmp$Time), mean)
x <- cbind(avg[1, ], avg[2, ])
x <- cbind(x, as.numeric(rownames(x)))
x <- as.data.frame(x)
rownames(x) <- 1:nrow(x)
colnames(x) <- c(as.character(levels(tmp$newfac)), "Time")
xx <- list()
for(i in 1:nlevels(tmp$newfac)){
eval(parse(text = paste("xx[[i]] <-
plotMGCVsmoothTweaked(gam(", as.character(levels(
tmp$newfac))[i],"~s(Time, k = nk), data = x))",
sep = "")))
}
# waveform
plot(xx[[1]]$x, xx[[1]]$y + mean(avg[1, ]),
type = "n", ylim = ylimit, xlab = "", ylab = "",
cex.axis=1.5,cex.lab=1.5,main="")
rect(tw[1], ylimit[1], tw[2], ylimit[2], density = NA,
col = grey(0.85))
par(new = TRUE)
plot(xx[[1]]$x, xx[[1]]$y + mean(avg[1, ]),
type = "l", ylim = ylimit, xlab = "",
ylab = "",cex.lab=1.5, axes=FALSE,main="")
mtext("Working Memory Capacity", side=3,line = 0.5,cex=1)
for(i in 2:nlevels(tmp$newfac)){
lines(xx[[i]]$x, xx[[i]]$y + mean(avg[i, ]), col = i)
}
abline(v=0,lty=3,col="grey")
# barplot
gr.avg <- tapply(dat0[dat0$Time>=tw[1]&dat0$Time<=tw[2],]$Amplitude,
dat0[dat0$Time>=tw[1]&dat0$Time<=tw[2],]$WMCdc,mean)
barplot(gr.avg, col = 1:nlevels(dat0$WMCdc),
ylab = "", ylim = ylimit2, legend.text = levels(dat0$WMCdc),
args.legend = list(x="bottomleft", ncol = 4, cex = 1.75,
bty = "n", title = "",xjust = 1), names.arg = FALSE, cex.axis = 1.5)
dev.off()
gc(TRUE,TRUE)
# interactions
pdf(file="figs/FzInteractions.pdf")
op <- par()$mar
par(mar = c(3.1, 4.1, 2.1, 3.1))
par(mfrow = c(4,2))
nk <- 25
tw <- c(80, 180)
e<-"Fz"
# Length X Position
#ylimit <- c(-7.5,7.5)
#ylimit2 <- c(-5.5,0)
tmp<-dat0[dat0$Time<=200&dat0$Electrode==e,]
tmp$Position<-as.factor(tmp$Position)
tmp$newfac <- as.factor(paste(tmp$LengthBdc,
tmp$Position, sep = "_"))
avg <- tapply(tmp$Amplitude, list(tmp$newfac, tmp$Time), mean)
x <- cbind(avg[1, ], avg[2, ])
for(i in 3:nlevels(tmp$newfac)){
x <- cbind(x, avg[i, ])
}
x <- cbind(x, as.numeric(rownames(x)))
x <- as.data.frame(x)
rownames(x) <- 1:nrow(x)
colnames(x) <- c(as.character(levels(tmp$newfac)), "Time")
xx <- list()
for(i in 1:nlevels(tmp$newfac)){
eval(parse(text = paste("xx[[i]] <-
plotMGCVsmoothTweaked(gam(", as.character(levels(
tmp$newfac))[i],"~s(Time, k = nk), data = x))",
sep = "")))
}
# waveform
plot(xx[[1]]$x, xx[[1]]$y + mean(avg[1, ]),
type = "n", ylim = ylimit, xlab = "", ylab = "",
cex.axis=1.5,cex.lab=1.5,main="")
rect(tw[1], ylimit[1], tw[2], ylimit[2], density = NA,
col = grey(0.85))
par(new = TRUE)
plot(xx[[1]]$x, xx[[1]]$y + mean(avg[1, ]),
type = "l", ylim = ylimit, xlab = "",
ylab = "",cex.lab=1.5, axes=FALSE,main="")
mtext("Time",side=1,line=-1.5,cex=1.1)
mtext(eval(parse(text=paste("expression(",
paste("Amplitude~~(",expression(mu * V), ")",
sep=""),")",sep=""))),side=2,cex=1.25,line=2.5)
mtext("Length X Position", side=3,line = 0.5,cex=1.1)
for(i in 2:nlevels(tmp$newfac)){
lines(xx[[i]]$x, xx[[i]]$y + mean(avg[i, ]), col = i)
}
abline(v=0,lty=3,col="grey")
dat0$newfac<-as.factor(paste(dat0$LengthBdc,
as.factor(dat0$Position),sep="_"))
# barplot
gr.avg <- tapply(dat0[dat0$Time>=tw[1]&dat0$Time<=tw[2],]$Amplitude,
dat0[dat0$Time>=tw[1]&dat0$Time<=tw[2],]$newfac,mean)
barplot(gr.avg, col = 1:nlevels(dat0$newfac),
ylab = "", ylim = ylimit2, legend.text = levels(dat0$newfac),
args.legend = list(x="bottomleft", ncol = 4, cex = 1,
bty = "n", title = "",xjust = 1), names.arg = FALSE, cex.axis = 1.5)
rm(avg,gr.avg,tmp,xx,x)
gc(TRUE,TRUE)
# Length X WMC
#ylimit <- c(-5,5)
#ylimit2 <- c(-3,0)
tmp<-dat0[dat0$Time<=200&dat0$Electrode==e,]
tmp$newfac <- as.factor(paste(tmp$LengthBdc,
tmp$WMCdc, sep = "_"))
avg <- tapply(tmp$Amplitude, list(tmp$newfac, tmp$Time), mean)
x <- cbind(avg[1, ], avg[2, ])
for(i in 3:nlevels(tmp$newfac)){
x <- cbind(x, avg[i, ])
}
x <- cbind(x, as.numeric(rownames(x)))
x <- as.data.frame(x)
rownames(x) <- 1:nrow(x)
colnames(x) <- c(as.character(levels(tmp$newfac)), "Time")
xx <- list()
for(i in 1:nlevels(tmp$newfac)){
eval(parse(text = paste("xx[[i]] <-
plotMGCVsmoothTweaked(gam(", as.character(levels(
tmp$newfac))[i],"~s(Time, k = nk), data = x))",
sep = "")))
}
plot(xx[[1]]$x, xx[[1]]$y + mean(avg[1, ]),
type = "n", ylim = ylimit, xlab = "", ylab = "",
cex.axis=1.5,cex.lab=1.5,main="")
rect(tw[1], ylimit[1], tw[2], ylimit[2], density = NA,
col = grey(0.85))
par(new = TRUE)
plot(xx[[1]]$x, xx[[1]]$y + mean(avg[1, ]),
type = "l", ylim = ylimit, xlab = "",
ylab = "",cex.lab=1.5, axes=FALSE,main="")
mtext("Length X WMC", side=3,line = 0.5,cex=1.1)
for(i in 2:nlevels(tmp$newfac)){
lines(xx[[i]]$x, xx[[i]]$y + mean(avg[i, ]), col = i)
}
abline(v=0,lty=3,col="grey")
# barplot
dat0$newfac<-as.factor(paste(dat0$LengthBdc,dat0$WMCdc,sep="_"))
gr.avg <- tapply(dat0[dat0$Time>=tw[1]&dat0$Time<=tw[2],]$Amplitude,
dat0[dat0$Time>=tw[1]&dat0$Time<=tw[2],]$newfac,mean)
barplot(gr.avg, col = 1:nlevels(dat0$newfac),
ylab = "", ylim = ylimit2, legend.text = levels(dat0$newfac),
args.legend = list(x="bottomleft", ncol = 4, cex = 1.75,
bty = "n", title = "",xjust = 1), names.arg = FALSE,cex.axis=1.5)
rm(avg,gr.avg,tmp,xx,x)
gc(TRUE,TRUE)
# WMC X Position
#ylimit <- c(-7.5,7.5)
#ylimit2 <- c(-5,0)
tmp<-dat0[dat0$Time<=200&dat0$Electrode==e,]
tmp$Position<-as.factor(tmp$Position)
tmp$newfac <- as.factor(paste(tmp$WMCdc,
tmp$Position, sep = "_"))
avg <- tapply(tmp$Amplitude, list(tmp$newfac, tmp$Time), mean)
x <- cbind(avg[1, ], avg[2, ])
for(i in 3:nlevels(tmp$newfac)){
x <- cbind(x, avg[i, ])
}
x <- cbind(x, as.numeric(rownames(x)))
x <- as.data.frame(x)
rownames(x) <- 1:nrow(x)
colnames(x) <- c(as.character(levels(tmp$newfac)), "Time")
xx <- list()
for(i in 1:nlevels(tmp$newfac)){
eval(parse(text = paste("xx[[i]] <-
plotMGCVsmoothTweaked(gam(", as.character(levels(
tmp$newfac))[i],"~s(Time, k = nk), data = x))",
sep = "")))
}
# waveform
plot(xx[[1]]$x, xx[[1]]$y + mean(avg[1, ]),
type = "n", ylim = ylimit, xlab = "", ylab = "",
cex.axis=1.5,cex.lab=1.5,main="")
rect(tw[1], ylimit[1], tw[2], ylimit[2], density = NA,
col = grey(0.85))
par(new = TRUE)
plot(xx[[1]]$x, xx[[1]]$y + mean(avg[1, ]),
type = "l", ylim = ylimit, xlab = "",
ylab = "",cex.lab=1.5, axes=FALSE,main="")
mtext("WMC X Position", side=3,line = 0.5,cex=1.1)
for(i in 2:nlevels(tmp$newfac)){
lines(xx[[i]]$x, xx[[i]]$y + mean(avg[i, ]), col = i)
}
abline(v=0,lty=3,col="grey")
# barplot
dat0$newfac<-as.factor(paste(dat0$WMCdc,
as.factor(dat0$Position),sep="_"))
gr.avg <- tapply(dat0[dat0$Time>=tw[1]&dat0$Time<=tw[2],]$Amplitude,
dat0[dat0$Time>=tw[1]&dat0$Time<=tw[2],]$newfac,mean)
barplot(gr.avg, col = 1:nlevels(dat0$newfac),
ylab = "",
ylim = ylimit2, legend.text = levels(dat0$newfac),
args.legend = list(x="bottomleft", ncol = 4, cex = 1,
bty = "n", title = "",xjust = 1), names.arg = FALSE,cex.axis=1.5)
rm(avg,gr.avg,tmp,xx,x)
gc(TRUE,TRUE)
# Length X Position X WMC
#ylimit <- c(-7.5,7.5)
#ylimit2 <- c(-7,0)
tmp<-dat0[dat0$Time<=200&dat0$Electrode==e,]
tmp$Position<-as.factor(tmp$Position)
tmp$newfac <- as.factor(paste(tmp$LengthBdc,
tmp$Position, tmp$WMCdc, sep = "_"))
avg <- tapply(tmp$Amplitude, list(tmp$newfac, tmp$Time), mean)
x <- cbind(avg[1, ], avg[2, ])
for(i in 3:nlevels(tmp$newfac)){
x <- cbind(x, avg[i, ])
}
x <- cbind(x, as.numeric(rownames(x)))
x <- as.data.frame(x)
rownames(x) <- 1:nrow(x)
colnames(x) <- c(as.character(levels(tmp$newfac)), "Time")
xx <- list()
for(i in 1:nlevels(tmp$newfac)){
eval(parse(text = paste("xx[[i]] <-
plotMGCVsmoothTweaked(gam(", as.character(levels(
tmp$newfac))[i],"~s(Time, k = nk), data = x))",
sep = "")))
}
# waveform
plot(xx[[1]]$x, xx[[1]]$y + mean(avg[1, ]),
type = "n", ylim = ylimit, xlab = "", ylab = "",
cex.axis=1.5,cex.lab=1.5,main="")
rect(tw[1], ylimit[1], tw[2], ylimit[2], density = NA,
col = grey(0.85))
par(new = TRUE)
plot(xx[[1]]$x, xx[[1]]$y + mean(avg[1, ]),
type = "l", ylim = ylimit, xlab = "",
ylab = "",cex.lab=1.5, axes=FALSE,main="")
mtext("Length X Position X WMC", side=3,line = 0.5,cex=1)
for(i in 2:nlevels(tmp$newfac)){
lines(xx[[i]]$x, xx[[i]]$y + mean(avg[i, ]), col = i)
}
abline(v=0,lty=3,col="grey")
# barplot
dat0$newfac<-as.factor(paste(dat0$LengthBdc,as.factor(dat0$Position),
dat0$WMCdc,sep="_"))
gr.avg <- tapply(dat0[dat0$Time>=tw[1]&dat0$Time<=tw[2],]$Amplitude,
dat0[dat0$Time>=tw[1]&dat0$Time<=tw[2],]$newfac,mean)
barplot(gr.avg, col = 1:nlevels(dat0$newfac),
ylab = "", ylim = c(-9,0), legend.text = levels(dat0$newfac),
args.legend = list(x="bottomleft", ncol = 4, cex = 0.785,
bty = "n", title = "",xjust = 1, yjust = 1), names.arg = FALSE,
cex.axis = 1.5)
rm(avg,gr.avg,tmp,xx,x)
gc(TRUE,TRUE)
par(mfrow = c(1, 1))
par(mar = op)
dev.off()
@
}
}
Although our main goal in this paper is to demonstrate the application of nonlinear analysis using continuous variables, for descriptive purposes, as well as for comparison with traditional ANOVA-based approaches, we first show the data treated categorically. Thus Position was treated as having 6 levels while Length and WMC were dichotomized as high/low based on a median split. Figure \ref{fig:FzMain} shows waveforms and mean N1a amplitudes for each level of each of these three variables. No effect of Length is apparent in this figure. However, there do appear to be amplitude differences for different levels of Position (middle panel) --- in particular, items that appeared first in the lists show a more negative N1a than subsequent positions. As such, there may be a nonlinear relationship between Position and N1a amplitude. The bottom panel also suggests the possibility of an effect of WMC, where participants with a higher WMC showed a more negative N1a than those with a lower working memory capacity.
\begin{figure}[h]
\centering
\includegraphics{figs/FzMainEffects.pdf}
\caption{Waveforms (left panels) and mean amplitudes (right panels) of the N1a at electrode Fz, in the 80--180 ms window, plotted separately for discretized levels of each independent variable. The {\it x}-axis represents time in milliseconds and the {\it y}-axis is amplitude in $\mu V$.}
\label{fig:FzMain}
\end{figure}
Figure \ref{fig:FzInter} shows waveforms and mean amplitudes for two- and three-way combinations of levels of the independent variables (corresponding to the two- and three-way interactions in an ANOVA model). There appears to be a {\it Length} $\times$ {\it Position} interaction, where long words in first position elicited a more negative N1a than short words in first position. There may be a {\it Length} $\times$ {\it WMC} interaction and a {\it WMC} $\times$ {\it Position} interaction. In the former case, shorter words elicited a smaller N1a in participants with a low working memory capacity. In the latter case, the Position effect on N1a amplitudes was stronger in participants with high than low working memory capacity. Finally, there is possibly a three-way interaction (bottom panel), where long words in the first position elicited a more negative N1a in participants with low than high working memory capacity (the leftmost black and red bars).
\begin{figure}[h]
\centering
\includegraphics{figs/FzInteractions.pdf}
\caption{Two-way and three-way interactions
between dichotomized levels of {\it Length}, {\it Position} and {\it WMC}. Details are as in Figure \ref{fig:FzMain}.}
\label{fig:FzInter}
\end{figure}
}
{\subsection{LME Analysis Assuming Linearity}
Our first LME analysis simply treated each variable as continuous. Results are shown in Table~\ref{tab:ContSmry}. The model assuming linearity suggests that there was only a significant main effect of Position. While it is notable that a main effect for Position was found under the assumption of linearity, Figure \ref{fig:FzMain} suggested that this variable had a nonlinear relationship with N1a amplitude. Describing the source of this main effect would necessitate drawing inferences over the six levels of Position from a relatively large number of pairwise comparisons.\footnote{While contrasts such as Helmert could also describe this relationship somewhat more parsimoniously, we had no a priori assumption as to the shape of the relationship between Position and N1a amplitude so the choice of appropriate contrast weights would have to be made post hoc.}
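The footnote's point about contrast coding can be made concrete. In {\tt R}, Helmert contrasts compare each level of a factor with the mean of the levels preceding it, so a single coding scheme can express a ``first position differs from the rest'' pattern more parsimoniously than all pairwise comparisons. The sketch below is a hypothetical illustration for a six-level Position factor; it is not part of the analysis scripts above.

```r
# Hedged sketch: Helmert coding for a hypothetical six-level Position factor.
# Column j contrasts level j + 1 against the mean of levels 1..j, so the
# first column tests position 2 vs. position 1, the last column tests
# position 6 vs. the mean of positions 1-5.
pos <- factor(1:6)
contrasts(pos) <- contr.helmert(6)
contrasts(pos)  # a 6 x 5 contrast matrix: 5 orthogonal comparisons
```

In a model formula, these five columns would replace the fifteen pairwise comparisons, but, as noted in the footnote, choosing such weights here would have to be done post hoc.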
{% R code: results table
{\footnotesize
<>=
load("tables/m1m2m3m4m5m6Fz.rda")
x<-as.data.frame(t(smry[,6]))
rownames(x)<-" "
colnames(x)<-rownames(smry)
print(xtable(x, caption = "Electrode Fz -- 80 to 180 ms. Probability (\\textit{p}) values from model for which linearity is assumed. \\textit{Pos} = \\textit{Position}. \\textit{Lngth} = \\textit{Length}. \\textit{WMC} = \\textit{Working Memory Capacity}.", label = "tab:ContSmry"))
@
}
}
{% R code: lmer models
{\footnotesize
% eval=FALSE
<>=
# We obtained the optimal model by comparing successively more complex models
# by way of AIC \cite[pp. 61--62]{Zuur2009}.
# This involves computing models with different main effects and interactions
# to determine which one best fits the data. For each model, (1) an initial
# model was fitted to the un-trimmed data, (2) outliers with a standardized
# residual at a distance greater than 2.5 standard deviations from 0 were
# excluded (representing 1.4\% of the data), and (3) the model was refitted on
# the trimmed data. The results of these tests are provided in
# Table~\ref{tab:ContAIC} below.
data(eegLong)
dat<-eegLong
rm(eegLong)
gc(TRUE,TRUE)
# restrict to electrode Fz and 80--180 ms window
dat <- dat[dat$Electrode == "Fz", ]
dat$Electrode<-as.factor(as.character(dat$Electrode))
dat <- dat[dat$Time >= 80 & dat$Time <= 180, ]
dat$LengthBc <- dat$LengthB - mean(dat$LengthB)
dat$WMCc <- dat$WMC - mean(dat$WMC)
dat$Position <- dat$Position - mean(dat$Position)
dat0 <- dat
#
# fit models
m1 <- lmer(Amplitude ~ Position * LengthBc * WMCc +
(1 | Subject), data = dat)
dat <- romr.fnc(m1, dat, 2.5)$data
# n.removed = 735
# percent.removed = 1.350706
m1 <- lmer(Amplitude ~ Position * LengthBc * WMCc +
(1 | Subject), data = dat)
#
dat <- dat0
m2 <- lmer(Amplitude ~ Position * LengthBc * WMCc +
(1 | Subject) + (1 | Item), data = dat)
dat <- romr.fnc(m2, dat, 2.5)$data
# n.removed = 733
# percent.removed = 1.34703
m2 <- lmer(Amplitude ~ Position * LengthBc * WMCc +
(1 | Subject) + (1 | Item), data = dat)
AIC(m1);AIC(m2);AIC(m1)-AIC(m2)
#
dat <- dat0
m3 <- lmer(Amplitude ~ Position * LengthBc * WMCc +
(1 | Subject) + (1 | Item) + (0 + WMCc | Item),
data = dat)
dat <- romr.fnc(m3, dat, 2.5)$data
# n.removed = 759
# percent.removed = 1.39481
m3 <- lmer(Amplitude ~ Position * LengthBc * WMCc +
(1 | Subject) + (1 | Item) + (0 + WMCc | Item),
data = dat)
AIC(m2);AIC(m3);AIC(m2)-AIC(m3)
#
dat <- dat0
m4 <- lmer(Amplitude ~ Position * LengthBc * WMCc +
(1 | Subject) + (1 | Item) + (0 + WMCc | Item) +
(0 + LengthBc | Subject), data = dat)
dat <- romr.fnc(m4, dat, 2.5)$data
# n.removed = 759
# percent.removed = 1.39481
m4 <- lmer(Amplitude ~ Position * LengthBc * WMCc +
(1 | Subject) + (1 | Item) + (0 + WMCc | Item) +
(0 + LengthBc | Subject), data = dat)
AIC(m3);AIC(m4);AIC(m3)-AIC(m4)
#
dat <- dat0
m5 <- lmer(Amplitude ~ Position * LengthBc * WMCc +
(1 | Subject) + (1 | Item) + (0 + WMCc | Item) +
(0 + Position | Subject), data = dat)
dat <- romr.fnc(m5, dat, 2.5)$data
# n.removed = 754
# percent.removed = 1.385622
m5 <- lmer(Amplitude ~ Position * LengthBc * WMCc +
(1 | Subject) + (1 | Item) + (0 + WMCc | Item) +
(0 + Position | Subject), data = dat)
AIC(m3);AIC(m5);AIC(m3)-AIC(m5)
#
dat <- dat0
m6 <- lmer(Amplitude ~ Position * LengthBc * WMCc +
(1 | Subject) + (1 | Item) + (0 + WMCc | Item) +
(0 + Position | Item), data = dat)
dat <- romr.fnc(m6, dat, 2.5)$data
# n.removed = 777
# percent.removed = 1.427889
m6 <- lmer(Amplitude ~ Position * LengthBc * WMCc +
(1 | Subject) + (1 | Item) + (0 + WMCc | Item) +
(0 + Position | Item), data = dat)
AIC(m3);AIC(m6);AIC(m3)-AIC(m6)
#
save(m1, file = "models/m1Fz.rda", compress = "xz")
save(m2, file = "models/m2Fz.rda", compress = "xz")
save(m3, file = "models/m3Fz.rda", compress = "xz")
save(m4, file = "models/m4Fz.rda", compress = "xz")
save(m5, file = "models/m5Fz.rda", compress = "xz")
save(m6, file = "models/m6Fz.rda", compress = "xz")
@
}
}
{% R code: lmer comparison
{\footnotesize
% eval=FALSE
<>=
# The best model was {\it m6}, which included by-subject and by-item random
# intercepts (meaning that some subjects and stimulus items had a mean N1a
# amplitude that was substantially greater or smaller than the grand mean N1a
# amplitude), by-item random slopes for working memory capacity (the {\it WMC}
# effect was substantially greater or smaller than the average {\it WMC}
# effect for some stimulus items), as well as by-item random slopes for {\it
# Position} (the positional effect was substantially greater or smaller than
# the average {\it Position} effect for some stimulus items). The random
# effect variances are provided in Table~\ref{tab:ContREmat}.
models<-paste("m",1:6,"Fz.rda",sep="")
for(i in models){
load(file.path("models",i))
}
# m1 vs. m2
tmp <- anova(m1, m2)
class(tmp) <- "data.frame"
comp <- data.frame(Comparison = "m1 vs m2", AIC.x = tmp$AIC[1],
AIC.y = tmp$AIC[2], relLik = exp((abs(tmp$AIC[1]-tmp$AIC[2]))/2),
Chisq = tmp$Chisq[2], Chi.df = tmp$`Chi Df`[2],
p.value = tmp$`Pr(>Chisq)`[2])
#
# m2 vs. m3
tmp <- anova(m2, m3)
class(tmp) <- "data.frame"
comp <- rbind(comp, data.frame(Comparison = "m2 vs m3", AIC.x = tmp$AIC[1],
AIC.y = tmp$AIC[2], relLik = exp((abs(tmp$AIC[1]-tmp$AIC[2]))/2),
Chisq = tmp$Chisq[2], Chi.df = tmp$`Chi Df`[2],
p.value = tmp$`Pr(>Chisq)`[2]))
#
# m3 vs. m4
tmp <- anova(m3, m4)
class(tmp) <- "data.frame"
comp <- rbind(comp, data.frame(Comparison = "m3 vs m4", AIC.x = tmp$AIC[1],
AIC.y = tmp$AIC[2], relLik = exp((abs(tmp$AIC[1]-tmp$AIC[2]))/2),
Chisq = tmp$Chisq[2], Chi.df = tmp$`Chi Df`[2],
p.value = tmp$`Pr(>Chisq)`[2]))
#
# m3 vs. m5
tmp <- anova(m3, m5)
class(tmp) <- "data.frame"
comp <- rbind(comp, data.frame(Comparison = "m3 vs m5", AIC.x = tmp$AIC[1],
AIC.y = tmp$AIC[2], relLik = exp((abs(tmp$AIC[1]-tmp$AIC[2]))/2),
Chisq = tmp$Chisq[2], Chi.df = tmp$`Chi Df`[2],
p.value = tmp$`Pr(>Chisq)`[2]))
#
# m3 vs. m6
tmp <- anova(m3, m6)
class(tmp) <- "data.frame"
comp <- rbind(comp, data.frame(Comparison = "m3 vs m6", AIC.x = tmp$AIC[1],
AIC.y = tmp$AIC[2], relLik = exp((abs(tmp$AIC[1]-tmp$AIC[2]))/2),
Chisq = tmp$Chisq[2], Chi.df = tmp$`Chi Df`[2],
p.value = tmp$`Pr(>Chisq)`[2]))
rownames(comp) <- comp$Comparison
comp <- comp[, 2:7]
#
pval<-comp[,6]
tmp<-vector("character")
for(i in 1:length(pval)){
if(pval[i]<0.001){
tmp<-c(tmp,"< 0.001")
}else{
tmp<-c(tmp,as.character(round(pval[i],3)))
}
}
comp[,6]<-tmp
#
relLik<-comp[,3]
tmp<-vector("character")
for(i in 1:length(relLik)){
if(relLik[i]>1000){
tmp<-c(tmp,"> 1000")
}else{
tmp<-c(tmp,as.character(round(relLik[i],1)))
}
}
comp[,3]<-tmp
save(comp, file = "smry/fullLMER_anovaFz.rda", compress = "xz")
#
# get summaries
tmp <- pamer.fnc(m1)
tmp1 <- pamer.fnc(m2)
smry <- as.data.frame(cbind(tmp[,6],tmp1[,6]))
tmp <- pamer.fnc(m3)
smry <- as.data.frame(cbind(smry,tmp[,6]))
tmp <- pamer.fnc(m4)
smry <- as.data.frame(cbind(smry,tmp[,6]))
tmp <- pamer.fnc(m5)
smry <- as.data.frame(cbind(smry,tmp[,6]))
tmp <- pamer.fnc(m6)
smry <- as.data.frame(cbind(smry,tmp[,6]))
colnames(smry)<-c("m1","m2","m3","m4","m5","m6")
rownames(smry)<-rownames(tmp)
rownames(smry) <- gsub("Position", "Pos", rownames(smry))
rownames(smry) <- gsub("WMCc", "WMC", rownames(smry))
rownames(smry) <- gsub("LengthBc", "Lngth", rownames(smry))
smry<-round(smry,3)
smry[1,]<-rep("< .001",ncol(smry))
save(smry,file="tables/m1m2m3m4m5m6Fz.rda", compress = "xz")
REmat<-summary(m6)@REmat
REmat<-gsub(" ","",REmat)
REmat<-as.data.frame(REmat,stringsAsFactors=FALSE)
rownames(REmat)<-1:nrow(REmat)
REmat$Variance<-as.numeric(as.character(REmat$Variance))
REmat$Std.Dev.<-as.numeric(as.character(REmat$Std.Dev.))
REmat$Variance<-round(REmat$Variance,3)
REmat$Std.Dev.<-round(REmat$Std.Dev.,3)
REmat$Name <- gsub("WMCc", "WMC", REmat$Name)
save(REmat,file="tables/m6REmat.rda", compress = "xz")
@
}
}
{% R code: lmer comparison table
{\footnotesize
<>=
# Results of models {\it m1} to {\it m6} appear in Table~\ref{tab:ContSmry}.
# print table
load("smry/fullLMER_anovaFz.rda")
print(xtable(comp, caption = "Comparing models for which linearity is assumed with different random effects structures. Note that \\textit{relLik}, which stands for \\textit{relative likelihood}, is computed as \\textit{exp((abs(AIC(x) - AIC(y)))/2)}. It provides an index of how much more likely the model with the lowest AIC is. \\textbf{m1} = \\textit{(1 | Subject)}. \\textbf{m2} = \\textit{(1 | Subject) + (1 | Item)}. \\textbf{m3} = \\textit{(1 | Subject) + (1 | Item) + (0 + WMC | Item)}. \\textbf{m4} = \\textit{(1 | Subject) + (1 | Item) + (0 + WMC | Item) + (0 + Length | Subject)}. \\textbf{m5} = \\textit{(1 | Subject) + (1 | Item) + (0 + WMC | Item) + (0 + Position | Subject)}. \\textbf{m6} = \\textit{(1 | Subject) + (1 | Item) + (0 + WMC | Item) + (0 + Position | Item)}.", label = "tab:ContAIC", digits = c(0, 0, 0, 0, 1, 0, 3)))
@
}
}
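The relative-likelihood formula given in the caption of Table~\ref{tab:ContAIC} can be checked by hand. The helper below is a minimal sketch with made-up AIC values (it is not part of the analysis scripts): the model with the smaller AIC is $\exp(|AIC_1 - AIC_2|/2)$ times as probable, in the Akaike sense, to minimize information loss.

```r
# Hedged sketch: relative likelihood of two models from their AIC values.
relLik <- function(aic.x, aic.y) exp(abs(aic.x - aic.y) / 2)
relLik(100, 104.6)  # exp(2.3), roughly 10: the lower-AIC model is ~10x as likely
```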
{% R code: lmer REmat
{\footnotesize
<>=
# print table
load("tables/m6REmat.rda")
print(xtable(REmat, caption = "Random effect structure of model \\textbf{m6} =
\\textit{(1 | Subject) + (1 | Item) + (0 + WMC | Item) + (0 + Position |
Item)}. \\textit{Pos} = \\textit{Position}. \\textit{Lngth} =
\\textit{Length}. \\textit{WMC} = \\textit{Working Memory Capacity}.", label =
"tab:ContREmat"),include.rownames=FALSE)
@
}
}
Fitting a single nonlinear function to Position could simplify the description and interpretation of the effect without recourse to cumbersome post hoc testing. In addition, it may be the case that the assumption of linearity missed important structure in the data. In the next section, we explore LME models for which linearity is {\it not} assumed.
}
{\subsection{Moving Away from the Assumption of Linearity}
{% models without factor variables -- 3-way interaction
The possibility of nonlinear relationships between our predictor variables and N1a amplitude is supported by Figure \ref{fig:ContAvg}, in which mean N1a amplitude at each sampled level of each variable is plotted, along with best-fitting nonlinear functions (lowess smooths).\footnote{A lowess smooth uses locally-weighted polynomial regression. `Local' is defined as a window of a certain span around each data point (e.g., 50 data points to the left and to the right). Each data point within a window is influenced (i.e., weighted) by its ``neighbours''. See {\tt ?lowess} in {\tt R} for more details.} The Length effect (top left panel) appears to be curved, with relatively constant (and small) N1a amplitudes from values 1 to roughly 6, after which N1a amplitude increases with increasing word length.\footnote{Because the N1a has a negative polarity, we refer to more negative amplitudes as ``increases''.} The effect of Position (bottom left panel) shows the opposite pattern: the amplitude of the N1a is most negative in the first position, less so in the second position, and then flattens out across the third through sixth positions -- even becoming slightly more negative at the last position. The WMC effect (top right panel) is more complex, with an initial increase in N1a amplitude from the lowest to low-middle WMC levels, followed by a decrease and then a subsequent increase for people with the highest WMC.
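The lowess fits described in the footnote can be reproduced with base {\tt R}'s {\tt lowess} function. The snippet below is a self-contained sketch on simulated data (the values are hypothetical), mirroring the plotting idiom used to produce Figure \ref{fig:ContAvg}.

```r
# Hedged sketch: a lowess smooth over noisy per-level averages (simulated).
set.seed(1)
x <- 1:60                                      # fake predictor levels
y <- -2 + sin(x / 10) + rnorm(60, sd = 0.3)    # fake mean amplitudes
plot(x, y, xlab = "Predictor", ylab = "Amplitude")
# f is the smoother span: the fraction of points influencing each local
# fit; a larger f yields a smoother (less wiggly) curve
lines(lowess(x, y, f = 2/3), col = 2)
```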
\begin{figure}[h]
\centering
\includegraphics[width=0.7\textwidth,height=0.55\textheight]{figs/ContinuousAverages.pdf}
\caption{N1a amplitude a\-ve\-rages as a function of each one of the predictor variables, treated as continuous (in contrast with the discretized forms of these variables shown in Figure \ref{fig:FzMain}). The black circles are the mean amplitude at each measured level of each variable. The red lines are {\tt lowess} smooths of the a\-ve\-rages.}
\label{fig:ContAvg}
\end{figure}
As introduced above, in our LME analysis we modelled these nonlinear relationships with restricted cubic splines (implemented by the function {\tt rcs} in {\tt R}). As a first step, these functions needed to be parameterized; that is, we needed to determine how many knots to use for each predictor. For each cubic spline, we used 2 knots plus the number of inflection points found in the lowess curves (i.e., the red lines in Figure \ref{fig:ContAvg}). Therefore, for Length and Position we used a restricted cubic spline with three knots (at the $1^{st}$, $50^{th}$, and $100^{th}$ percentiles of each variable's distribution), and for WMC one with four knots (at the $1^{st}$, $25^{th}$, $75^{th}$, and $100^{th}$ percentiles). Setting the knots at quantile locations ensured that each ``window'' contained an equal number of data points (this is done automatically by {\tt rcs}). The choice of the number of knots was driven by the shape of the empirically-derived lowess fits (i.e., not by investigator assumptions), with additional knots allowing for a more wiggly fit where necessary. In cases where this degree of flexibility was unwarranted, the fitted parameters simply allow for a flatter (less convoluted) function.
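To make the spline parameterization concrete, the sketch below builds a restricted cubic spline basis by hand in base {\tt R}, following Harrell's parameterization (the form that {\tt rcs} constructs inside model formulas; the data and knot locations here are hypothetical). With $k$ knots the basis has $k-1$ columns: one linear term plus $k-2$ nonlinear terms, each constrained to be linear beyond the outer knots.

```r
# Hedged sketch: restricted cubic spline basis for a given set of knots.
rcsBasis <- function(x, knots) {
  k <- sort(knots); nk <- length(k)
  cub <- function(u) pmax(u, 0)^3            # truncated cubic
  scl <- (k[nk] - k[1])^2                    # scaling keeps terms comparable
  nonlin <- sapply(1:(nk - 2), function(j)
    (cub(x - k[j]) -
     cub(x - k[nk - 1]) * (k[nk] - k[j]) / (k[nk] - k[nk - 1]) +
     cub(x - k[nk])     * (k[nk - 1] - k[j]) / (k[nk] - k[nk - 1])) / scl)
  cbind(x, nonlin)                           # k - 1 regression terms in all
}
B <- rcsBasis(seq(0, 10, length = 101), knots = c(1, 5, 9))
dim(B)  # 101 rows, 2 columns: 3 knots yield 2 regression terms
```

Fitting {\tt Amplitude} on the columns of such a basis (which is what a formula term like {\tt rcs(Position, 3)} does) estimates one coefficient per column, so a three-knot spline costs only one parameter more than a straight line.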
{% R code: compare lin vs nonlin
<>=
load("models/m6Fz.rda")
m6a <- m6
load("models/m6FzRCS.rda")
tmp <- anova(m6a, m6)
class(tmp) <- "data.frame"
tmp<-data.frame(Comparison = "m6a vs m6", AIC.x = tmp$AIC[2],
AIC.y = tmp$AIC[1], relLik = exp((abs(tmp$AIC[2]-tmp$AIC[1]))/2),
Chisq = tmp$Chisq[2], Chi.df = tmp$`Chi Df`[2],
p.value = tmp$`Pr(>Chisq)`[2])
tmp
# Comparison AIC.x AIC.y relLik Chisq Chi.df p.value
# 1 m6a vs m6 355858.8 356270.1 2.069304e+89 487.3146 38 2.343428e-79
@
}
{% R code: fitting the models
{\footnotesize
% eval=FALSE
<>=
data(eegLong)
dat<-eegLong
rm(eegLong)
gc(TRUE,TRUE)
# restrict to electrode Fz and 80--180 ms window
dat <- dat[dat$Electrode == "Fz", ]
dat$Electrode<-as.factor(as.character(dat$Electrode))
dat <- dat[dat$Time >= 80 & dat$Time <= 180, ]
pdf("figs/ContinuousAverages.pdf")
par(mfrow=c(2,2))
# Amplitude ~ Length
avg<-tapply(dat$Amplitude,dat$LengthB,mean)
plot(as.numeric(names(avg)),avg,xlab="Length",ylab="",
main="Amplitude ~ Length",ylim=c(-4,-1.25),cex.lab=1.25,cex.main=1.5,
cex.axis=1.25)
mtext(eval(parse(text=paste("expression(",paste("Amplitude~~(",
expression(mu * V), ")",sep=""),")",sep=""))),side=2,cex=1.25,
line=2.25)
lines(lowess(as.numeric(names(avg)),avg),col=2)
legend("bottomleft",legend=c("average","lowess smooth"),
lty=c(NA,1),pch=c(21,NA),col=1:2,bty="n",cex=1.25)
# Amplitude ~ WMC
avg<-tapply(dat$Amplitude,dat$WMC,mean)
plot(as.numeric(names(avg)),avg,xlab="WMC",ylab="",main="Amplitude ~ WMC",
ylim=c(-2.6,-0.5),cex.lab=1.25,cex.main=1.5,cex.axis=1.25)
lines(lowess(as.numeric(names(avg)),avg),col=2)
legend("topleft",legend=c("average","lowess smooth"),
lty=c(NA,1),pch=c(21,NA),col=1:2,bty="n",cex=1.25)
# Amplitude ~ Position
avg<-tapply(dat$Amplitude,dat$Position,mean)
plot(as.numeric(names(avg)),avg,xlab="Position",ylab="",
main="Amplitude ~ Position",ylim=c(-2.75,-1.7),cex.lab=1.25,
cex.main=1.5,cex.axis=1.25)
lines(lowess(as.numeric(names(avg)),avg),col=2)
legend("bottomright",legend=c("average","lowess smooth"),
lty=c(NA,1),pch=c(21,NA),col=1:2,bty="n",cex=1.25)
par(mfrow=c(1,1))
dev.off()
#
# mean-center the continuous predictors
dat$LengthBc <- dat$LengthB - mean(dat$LengthB)
dat$WMCc <- dat$WMC - mean(dat$WMC)
dat$Position <- dat$Position - mean(dat$Position)
dat0 <- dat
#
# fit models
library(rms)
m1 <- lmer(Amplitude ~ rcs(Position, 3) * rcs(LengthBc, 3) * rcs(WMCc, 4) +
(1 | Subject), data = dat)
dat <- romr.fnc(m1, dat, 2.5)$data
# n.removed = 739
# percent.removed = 1.358056
m1 <- lmer(Amplitude ~ rcs(Position, 3) * rcs(LengthBc, 3) * rcs(WMCc, 4) +
(1 | Subject), data = dat)
#
dat <- dat0
m2 <- lmer(Amplitude ~ rcs(Position, 3) * rcs(LengthBc, 3) * rcs(WMCc, 4) +
(1 | Subject) + (1 | Item), data = dat)
dat <- romr.fnc(m2, dat, 2.5)$data
# n.removed = 744
# percent.removed = 1.367245
m2 <- lmer(Amplitude ~ rcs(Position, 3) * rcs(LengthBc, 3) * rcs(WMCc, 4) +
(1 | Subject) + (1 | Item), data = dat)
anova(m1,m2)
#
dat <- dat0
m3 <- lmer(Amplitude ~ rcs(Position, 3) * rcs(LengthBc, 3) * rcs(WMCc, 4) +
(1 | Subject) + (1 | Item) + (0 + rcs(WMCc, 4) | Item), data = dat)
dat <- romr.fnc(m3, dat, 2.5)$data
# n.removed = 760
# percent.removed = 1.396648
m3 <- lmer(Amplitude ~ rcs(Position, 3) * rcs(LengthBc, 3) * rcs(WMCc, 4) +
(1 | Subject) + (1 | Item) + (0 + rcs(WMCc, 4) | Item), data = dat)
anova(m2,m3)
#
dat <- dat0
m4 <- lmer(Amplitude ~ rcs(Position, 3) * rcs(LengthBc, 3) * rcs(WMCc, 4) +
(1 | Subject) + (1 | Item) + (0 + rcs(WMCc, 4) | Item) +
(0 + rcs(LengthBc, 3) | Subject), data = dat)
dat <- romr.fnc(m4, dat, 2.5)$data
# n.removed = 759
# percent.removed = 1.39481
m4 <- lmer(Amplitude ~ rcs(Position, 3) * rcs(LengthBc, 3) * rcs(WMCc, 4) +
(1 | Subject) + (1 | Item) + (0 + rcs(WMCc, 4) | Item) +
(0 + rcs(LengthBc, 3) | Subject), data = dat)
anova(m3,m4)
#
dat <- dat0
m5 <- lmer(Amplitude ~ rcs(Position, 3) * rcs(LengthBc, 3) * rcs(WMCc, 4) +
(1 | Subject) + (1 | Item) + (0 + rcs(WMCc, 4) | Item) +
(0 + rcs(Position, 3) | Subject), data = dat)
dat <- romr.fnc(m5, dat, 2.5)$data
# n.removed = 760
# percent.removed = 1.396648
m5 <- lmer(Amplitude ~ rcs(Position, 3) * rcs(LengthBc, 3) * rcs(WMCc, 4) +
(1 | Subject) + (1 | Item) + (0 + rcs(WMCc, 4) | Item) +
(0 + rcs(Position, 3) | Subject), data = dat)
anova(m3,m5)
#
dat <- dat0
m6 <- lmer(Amplitude ~ rcs(Position, 3) * rcs(LengthBc, 3) * rcs(WMCc, 4) +
(1 | Subject) + (1 | Item) + (0 + rcs(WMCc, 4) | Item) +
(0 + rcs(Position, 3) | Subject)+ (0 + rcs(Position, 3) | Item),
data = dat)
outliers <- romr.fnc(m6, dat, 2.5)
outliers <- as.numeric(rownames(outliers$data))
save(outliers, file = "data/m6_outliers.rda", compress = "xz")
dat <- romr.fnc(m6, dat, 2.5)$data
# n.removed = 795
# percent.removed = 1.460967
m6 <- lmer(Amplitude ~ rcs(Position, 3) * rcs(LengthBc, 3) * rcs(WMCc, 4) +
(1 | Subject) + (1 | Item) + (0 + rcs(WMCc, 4) | Item) +
(0 + rcs(Position, 3) | Subject)+ (0 + rcs(Position, 3) | Item),
data = dat)
anova(m5,m6)
#
save(m1, file = "models/m1FzRCS.rda", compress = "xz")
save(m2, file = "models/m2FzRCS.rda", compress = "xz")
save(m3, file = "models/m3FzRCS.rda", compress = "xz")
save(m4, file = "models/m4FzRCS.rda", compress = "xz")
save(m5, file = "models/m5FzRCS.rda", compress = "xz")
save(m6, file = "models/m6FzRCS.rda", compress = "xz")
#
# look at different k values for rcs(WMC)
dat <- dat0
m7 <- lmer(Amplitude ~ rcs(Position, 3) * rcs(LengthBc, 3) * rcs(WMCc, 3) +
(1 | Subject) + (1 | Item) + (0 + rcs(WMCc, 3) | Item) +
(0 + rcs(Position, 3) | Subject)+ (0 + rcs(Position, 3) | Item),
data = dat)
dat <- romr.fnc(m7, dat, 2.5)$data
# n.removed = 780
# percent.removed = 1.433402
m7 <- lmer(Amplitude ~ rcs(Position, 3) * rcs(LengthBc, 3) * rcs(WMCc, 3) +
(1 | Subject) + (1 | Item) + (0 + rcs(WMCc, 3) | Item) +
(0 + rcs(Position, 3) | Subject)+ (0 + rcs(Position, 3) | Item),
data = dat)
anova(m6,m7)
#
save(m7, file = "models/m7FzRCS.rda", compress = "xz")
@
}
}
{% R code: model comparisons
{\footnotesize
% eval=FALSE
<>=
# m1 vs. m2
tmp <- anova(m1, m2)
class(tmp) <- "data.frame"
comp <- data.frame(Comparison = "m1 vs m2", AIC.x = tmp$AIC[1],
AIC.y = tmp$AIC[2], relLik = exp((abs(tmp$AIC[1]-tmp$AIC[2]))/2),
Chisq = tmp$Chisq[2], Chi.df = tmp$`Chi Df`[2],
p.value = tmp$`Pr(>Chisq)`[2])
#
# m2 vs. m3
tmp <- anova(m2, m3)
class(tmp) <- "data.frame"
comp <- rbind(comp, data.frame(Comparison = "m2 vs m3", AIC.x = tmp$AIC[1],
AIC.y = tmp$AIC[2], relLik = exp((abs(tmp$AIC[1]-tmp$AIC[2]))/2),
Chisq = tmp$Chisq[2], Chi.df = tmp$`Chi Df`[2],
p.value = tmp$`Pr(>Chisq)`[2]))
#
# m3 vs. m4
tmp <- anova(m3, m4)
class(tmp) <- "data.frame"
comp <- rbind(comp, data.frame(Comparison = "m3 vs m4", AIC.x = tmp$AIC[1],
AIC.y = tmp$AIC[2], relLik = exp((abs(tmp$AIC[1]-tmp$AIC[2]))/2),
Chisq = tmp$Chisq[2], Chi.df = tmp$`Chi Df`[2],
p.value = tmp$`Pr(>Chisq)`[2]))
#
# m3 vs. m5
tmp <- anova(m3, m5)
class(tmp) <- "data.frame"
comp <- rbind(comp, data.frame(Comparison = "m3 vs m5", AIC.x = tmp$AIC[1],
AIC.y = tmp$AIC[2], relLik = exp((abs(tmp$AIC[1]-tmp$AIC[2]))/2),
Chisq = tmp$Chisq[2], Chi.df = tmp$`Chi Df`[2],
p.value = tmp$`Pr(>Chisq)`[2]))
#
# m5 vs. m6
tmp <- anova(m5, m6)
class(tmp) <- "data.frame"
comp <- rbind(comp, data.frame(Comparison = "m5 vs m6", AIC.x = tmp$AIC[1],
AIC.y = tmp$AIC[2], relLik = exp((abs(tmp$AIC[1]-tmp$AIC[2]))/2),
Chisq = tmp$Chisq[2], Chi.df = tmp$`Chi Df`[2],
p.value = tmp$`Pr(>Chisq)`[2]))
#
# m6 vs. m7
tmp <- anova(m6, m7)
class(tmp) <- "data.frame"
comp <- rbind(comp, data.frame(Comparison = "m7 vs m6", AIC.x = tmp$AIC[1],
AIC.y = tmp$AIC[2], relLik = exp((abs(tmp$AIC[1]-tmp$AIC[2]))/2),
Chisq = tmp$Chisq[2], Chi.df = tmp$`Chi Df`[2],
p.value = tmp$`Pr(>Chisq)`[2]))
rownames(comp) <- comp$Comparison
comp <- comp[, 2:7]
#
pval<-comp[,6]
tmp<-vector("character")
for(i in 1:length(pval)){
if(pval[i]<0.001){
tmp<-c(tmp,"< 0.001")
}else{
tmp<-c(tmp,as.character(round(pval[i],3)))
}
}
comp[,6]<-tmp
#
relLik<-comp[,3]
tmp<-vector("character")
for(i in 1:length(relLik)){
if(relLik[i]>1000){
tmp<-c(tmp,"> 1000")
}else{
tmp<-c(tmp,as.character(round(relLik[i],1)))
}
}
comp[,3]<-tmp
save(comp, file = "smry/rcsComparisons.rda", compress = "xz")
#
# look at model diagnostic plots
png("figs/m6_mcp.png")
mcp.fnc(m6)
dev.off()
# good model criticism plots
#
# get summaries
tmp <- pamer.fnc(m1)
tmp1 <- pamer.fnc(m2)
smry <- as.data.frame(cbind(tmp[,6],tmp1[,6]))
tmp <- pamer.fnc(m3)
smry <- as.data.frame(cbind(smry,tmp[,6]))
tmp <- pamer.fnc(m4)
smry <- as.data.frame(cbind(smry,tmp[,6]))
tmp <- pamer.fnc(m5)
smry <- as.data.frame(cbind(smry,tmp[,6]))
tmp <- pamer.fnc(m6)
smry <- as.data.frame(cbind(smry,tmp[,6]))
colnames(smry)<-c("m1","m2","m3","m4","m5","m6")
rownames(smry)<-rownames(tmp)
rownames(smry) <- gsub("Position", "Pos", rownames(smry))
rownames(smry) <- gsub("WMCc", "WMC", rownames(smry))
rownames(smry) <- gsub("LengthBc", "Lngth", rownames(smry))
rownames(smry) <- gsub(" ", "", rownames(smry))
rownames(smry) <- gsub("rcs\\(", "", rownames(smry))
rownames(smry) <- gsub(",3)", "", rownames(smry))
rownames(smry) <- gsub(",4)", "", rownames(smry))
smry<-round(smry,3)
smry[1,]<-rep("< .001",ncol(smry))
smry[5,]<-c(rep("< .001",4),smry[5,5],"< 0.001")
save(smry,file="tables/m1m2m3m4m5m6RCS.rda", compress = "xz")
#
# random effect structure
REmat<-summary(m6)@REmat
REmat<-gsub(" ","",REmat)
REmat<-as.data.frame(REmat,stringsAsFactor=FALSE)
rownames(REmat)<-1:nrow(REmat)
REmat$Variance<-as.numeric(as.character(REmat$Variance))
REmat$Std.Dev.<-as.numeric(as.character(REmat$Std.Dev.))
REmat$Variance<-round(REmat$Variance,3)
REmat$Std.Dev.<-round(REmat$Std.Dev.,3)
colnames(REmat)<-c(colnames(REmat)[1:5]," "," ")
REmat$Name <- gsub("WMCc", "WMC", REmat$Name)
save(REmat,file="tables/rcsREmat.rda", compress = "xz")
@
}
}
{% R code: comparison table
{\footnotesize
<>=
# print table
load("smry/rcsComparisons.rda")
print(xtable(comp, caption = "Comparisons of models for which linearity is not assumed, across random-effect structures of increasing complexity. Note that \\textit{relLik}, which stands for \\textit{relative likelihood}, is computed as \\textit{exp((abs(AIC(x) - AIC(y)))/2)}. It provides an index of how much more likely the model with the lowest AIC is. \\textbf{m1} = \\textit{(1 | Subject)}. \\textbf{m2} = \\textit{(1 | Subject) + (1 | Item)}. \\textbf{m3} = \\textit{(1 | Subject) + (1 | Item) + (0 + rcs(WMC, 4) | Item)}. \\textbf{m4} = \\textit{(1 | Subject) + (1 | Item) + (0 + rcs(WMCc, 4) | Item) + (0 + rcs(LengthBc, 3) | Subject)}. \\textbf{m5} = \\textit{(1 | Subject) + (1 | Item) + (0 + rcs(WMCc, 4) | Item) + (0 + rcs(Position, 3) | Subject)}. \\textbf{m6} = \\textit{(1 | Subject) + (1 | Item) + (0 + rcs(WMCc, 4) | Item) + (0 + rcs(Position, 3) | Subject) + (0 + rcs(Position, 3) | Item)}. \\textbf{m7} = \\textit{(1 | Subject) + (1 | Item) + (0 + rcs(WMCc, 3) | Item) + (0 + rcs(Position, 3) | Subject) + (0 + rcs(Position, 3) | Item)}.", label = "tab:rcsAIC", digits = c(0, 0, 0, 0, 1, 0, 3)))
@
}
}
The base model had the form {\it Amplitude as a function of rcs(Position, 3) $\times$ rcs(Length, 3) $\times$ rcs(WMC, 4)}, that is, a three-way interaction between restricted cubic splines for Position, Length, and WMC. This model was much more likely than the model for which linearity was assumed (model not assuming linearity: {\it df} = 51, AIC = 355859; model assuming linearity: {\it df} = 13, AIC = 356270; $\chi^2$ = 487.3, {\it df} = 38, {\it p} $< 0.001$). Results of this model appear in Table~\ref{tab:rcsSmry}. The plot of the three-way interaction is shown in Figure \ref{fig:m6Plot}.
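The relative likelihood reported in Table \ref{tab:rcsAIC} follows directly from the two AIC values; as a quick sketch using the rounded AICs quoted above:
{% R code: relative likelihood from AICs (illustrative)
{\footnotesize
% eval=FALSE
<>=
# relLik = exp((AIC(worse) - AIC(better))/2); rounded AICs from the text
aic.rcs <- 355859  # model not assuming linearity
aic.lin <- 356270  # model assuming linearity
exp((aic.lin - aic.rcs)/2)  # on the order of 10^89: the rcs model is
                            # overwhelmingly more likely
@
}
}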
{% R code: remat
{\footnotesize
<>=
# print table
load("tables/rcsREmat.rda")
print(xtable(REmat, caption = "Random effect structure of model \\textbf{m6} = \\textit{(1 | Subject) + (1 | Item) + (0 + rcs(WMC, 4) | Item) + (0 + rcs(Position, 3) | Subject) + (0 + rcs(Position, 3) | Item)}. \\textit{Length} and \\textit{Position} each have two \\textit{rcs} terms, whereas \\textit{WMC} has three.", label = "tab:rcsREmat"), include.rownames=FALSE)
@
}
}
{% R code: print summary table
{\footnotesize
<>=
load("tables/m1m2m3m4m5m6RCS.rda")
x<-as.data.frame(t(smry[,6]))
rownames(x)<-" "
colnames(x)<-rownames(smry)
print(xtable(x, caption = "Probability (\\textit{p}-)values from the model for which linearity was not assumed. \\textit{Pos} = \\textit{Position}. \\textit{Lngth} = \\textit{Length}, \\textit{WMC} = \\textit{Working Memory Capacity}. \\textit{rcs} = \\textit{restricted cubic spline}.", label = "tab:rcsSmry"))
@
}
}
{% R code: create RCS plots
{\footnotesize
% eval=FALSE
<>=
data(eegLong)
dat<-eegLong
rm(eegLong)
gc(TRUE,TRUE)
# restrict to electrode Fz and 80--180 ms window
dat <- dat[dat$Electrode == "Fz", ]
dat$Electrode<-as.factor(as.character(dat$Electrode))
dat <- dat[dat$Time >= 80 & dat$Time <= 180, ]
dat$LengthBc <- dat$LengthB - mean(dat$LengthB)
dat$WMCc <- dat$WMC - mean(dat$WMC)
dat$Position <- dat$Position - mean(dat$Position)
library(rms)
library(fields)
n=100
data(vars)
wmcO<-seq(min(vars$WMC),max(vars$WMC),length.out=n)
wmc<-seq(min(dat$WMCc),max(dat$WMCc),length.out=n)
lengthBO<-seq(min(vars$LengthB),max(vars$LengthB),length.out=n)
lengthB<-seq(min(dat$LengthBc),max(dat$LengthBc),length.out=n)
positionO<-1:6
position<-sort(unique(dat$Position))
# multiply model matrix by coefficients to get predicted values
# without the random effects
load("data/m6_outliers.rda")
dat <- dat[outliers,]
rownames(dat) <- 1:nrow(dat)
load("models/m6FzRCS.rda")
dat$fitted<-m6@X%*%fixef(m6)
# put this into a model to use predictrms
m.ols<-ols(fitted~rcs(Position, 3) * rcs(LengthBc, 3) * rcs(WMCc, 4),data=dat)
# the coefficients from m.ols == those from fixef(m6)
cor(fixef(m6),coef(m.ols))
# [1] 1
# plot for each position
pdf("figs/m6FzRCS.pdf")
par(mfrow=c(3,2),mar=c(4.1,4.1,3.1,1.1))
zlimit<-c(-6,6)
for(p in 1:length(position)){
# create newdata for Position == p
mygrid<-expand.grid(wmc,lengthB)
mygrid$Position<-position[p]
colnames(mygrid)<-c("WMCc","LengthBc","Position")
mygrid$Predicted<-predict(m.ols,mygrid)
# reformat to use image
m<-matrix(NA,ncol=length(wmc),nrow=length(lengthB))
for(i in 1:nrow(m)){ # lengthB
for(j in 1:ncol(m)){ # wmc
x<-mygrid[mygrid$LengthBc==lengthB[i]&mygrid$WMCc==wmc[j],"Predicted"]
if(length(x)>0){
m[i,j]<-x
}
}
}
# plot
len<-vector("numeric")
for(k in 1:12){
len<-c(len,lengthB[which(lengthBO==k)])
}
image.plot(lengthBO,wmcO,m,col=topo.colors(50),main=paste("Position",p),
xlab=ifelse(p%in%c(1,4,5),"Length (# letters)",""),
ylab=ifelse(p%in%c(1,3,5),"WMC (from 0.50 to 0.90)",""),zlim=zlimit,
legend.args=list(text=expression(Amplitude~~(mu*V)),line=1.5,cex=0.65))
contour(lengthBO,wmcO,m,nlevels=15,add=TRUE)
}
par(mfrow=c(1,1),mar=c(5.1,4.1,4.1,2.1))
dev.off()
@
}
}
\begin{figure}[h]
\centering
\includegraphics[width=0.8\textwidth,height=0.6\textheight]{figs/m6FzRCS.pdf}
\caption{{\it Length $\times$ WMC} interaction of the best-fitting model ({\it m6}) from the nonlinear analysis. Each panel shows the {\it Length $\times$ WMC} interaction at one of the six levels of Position. The amplitude of the N1a is shown using the same color coding as in Figure \ref{fig:topomap141ms}, with the scale shown in the figure. The black contours are isovoltage lines, labelled with the voltage in microvolts.}
\label{fig:m6Plot}
\end{figure}
As in the analysis assuming linearity, here we found a significant main effect of Position. This confirmed the pattern observed in both Figures \ref{fig:FzMain} (discretized variables) and \ref{fig:ContAvg} (continuous variables), whereby N1a amplitudes were largest for words in the first position but were more or less consistent across subsequent positions. The lack of significant main effects for Length or WMC is also consistent with the linear analysis, suggesting that although the nonlinear curves shown in Figure \ref{fig:ContAvg} appeared to describe the data more closely than a straight line would have, these relationships were not statistically reliable.
More striking differences between the linear and nonlinear analyses reside in the interaction structure of the models. The linear model did not include any significant interactions, while in the nonlinear model both the two-way {\it Position $\times$ WMC} and the three-way {\it Position $\times$ Length $\times$ WMC} interactions were significant. Recall that these interactions were predicted above based on our examination of the plots shown in Figure \ref{fig:FzInter}. These interactions are provided in Figure \ref{fig:m6Plot}. To display the three-way interaction, we have plotted the {\it Length $\times$ WMC} interaction at each level of Position. Given the presence of a three-way interaction, we did not attempt to interpret the two-way {\it Position $\times$ WMC} interaction since this is necessarily conditioned on the further interaction of these variables by Length.
Figure \ref{fig:m6Plot} shows a complex relationship between these three variables. The stronger N1a at Position 1 is clear from the larger area of dark blue in the panel for this position; however, the interaction shows that this effect was maximal for the longest words, and was furthermore greater for people with WMC in the middle of the sampled range, with somewhat smaller N1a amplitudes at both higher and lower WMC (though because fewer participants had extreme values on any given measure, parameter estimates are less robust at both ends of the WMC range). The nonlinear relationship between Length and N1a amplitude observed in Figure \ref{fig:ContAvg} is evident in these plots as well: the spacing between the isovoltage lines is greater at shorter lengths, indicating a slower rate of amplitude increase, compared to higher values of Length. A similar relationship between the three variables is evident in the panel for Position 2, though with lower amplitudes overall. At later positions, however, a different relationship between WMC and N1a amplitude emerged. While N1a amplitude still increased in a nonlinear fashion with Length, at Positions 3--6 people with lower WMC showed larger N1a amplitudes than those with higher WMC. Indeed, at least up to Position 4, people with the lowest WMC showed consistent N1a amplitudes ($\sim$ 3 $\mu$V for the longest words), while at later positions people with higher WMC showed little evidence of an N1a at all (voltage close to 0 $\mu$V).
To summarize these results, we can say that the N1a was robust across subjects at early positions in each sequence of four-word chunks, and larger for longer words. However, as more chunks were presented within a block, N1a amplitudes for people with higher WMC decreased to near zero, while the N1a persisted in people with lower WMC through all six chunks within a block.\footnote{Although at Position 6, voltage values at the lowest WMC decreased relative to slightly higher WMC values, as noted earlier the pattern at the tails of WMC must be viewed with skepticism as only one participant had the lowest WMC score.} These results are consistent with the task demands of the study and what we know about working memory and attention. As noted earlier, the N1a is associated with attention, as attended items typically elicit higher N1a amplitudes \cite[and references cited therein]{Luck2005, Penolazzi2007}. In this study, participants were required to recall as many of the four-word chunks as possible at the end of each block of six chunks. If we take the N1a as indexing the amount of attentional resources allocated to the task of storing each four-word sequence into short-term memory for later recall, then we can interpret the effects of length as indicating that more attentional resources were required to encode and retain longer words in short-term memory. The effect of Position is consistent with the ``primacy'' effect in working memory \cite{Deese1957} whereby the first-presented items in a list receive more attention and tend to be better-remembered. Likewise, the effects of WMC indicate that people with lower working memory capacity needed to allocate more attentional resources to encoding chunks presented later within a block, particularly for longer words. People with higher WMC, on the other hand, actually appeared to allocate fewer attentional resources to chunks in later positions, especially those with longer words.
In the behavioural analysis, we had found that Position, LengthB, and WMC affected the probability of a four-word sequence being correctly recalled. More specifically, (i) more recently presented sequences were more likely to be correctly recalled, (ii) sequences for which the second word was longer were less likely to be correctly recalled, and (iii) participants with a higher working memory capacity were more likely to correctly recall a sequence. Our ability to relate the ERP results back to behaviour, however, is seriously compromised by the fact that we conflated successfully- and unsuccessfully-recalled sequences. In the following section, we investigate whether the recall status of a sequence affected N1a amplitudes and whether correctly and incorrectly recalled sequences were differentially affected by Position, Length, and WMC.
}
{\subsection{Interactions with a Dichotomous (Factor) Variable}
The preceding section demonstrated a nonlinear analysis of three continuous predictor variables, and how a three-way interaction can be visualized and interpreted. In some cases, however, predictors are naturally factorial (categorical) rather than continuous. For example, in the present study participants performed a later recall task, so accuracy on this task for each chunk could be included as an additional, factorial (correct/incorrect) predictor. Factorial predictors are readily handled within the LME framework; with the {\it lmer} function this is achieved simply by ensuring that the variable is of type ``factor'' (as opposed to ``numeric'', as continuous variables are). Including the factorial variable Recalled allowed us to address two additional questions: (1) Is there a relationship between N1a amplitudes and the successful (or unsuccessful) recall of a four-word sequence? (2) If such a relationship exists, is it modulated by a participant's working memory capacity, the length of the second word of a sequence, and/or the position at which a four-word sequence was presented?
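As a minimal sketch with hypothetical toy data (not the study data), the chunk below illustrates the point: once a variable is stored as a factor, {\it lmer} estimates a separate contrast for each non-reference level rather than a single slope.
{% R code: factor predictors in lmer (illustrative toy example)
{\footnotesize
% eval=FALSE
<>=
library(lme4)
set.seed(1)
toy <- data.frame(Amplitude = rnorm(200),
  Recalled = factor(rep(c("correct", "incorrect"), 100)),
  Subject = factor(rep(1:20, each = 10)))
m.toy <- lmer(Amplitude ~ Recalled + (1 | Subject), data = toy)
fixef(m.toy)  # intercept plus one contrast term: Recalledincorrect
@
}
}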
{% R code: model with factor variables -- 4-way interaction
{\footnotesize
% eval=FALSE
<>=
data(eegLong)
dat<-eegLong
rm(eegLong)
gc(TRUE,TRUE)
# restrict to electrode Fz and 80--180 ms window
dat <- dat[dat$Electrode == "Fz", ]
dat$Electrode<-as.factor(as.character(dat$Electrode))
dat <- dat[dat$Time >= 80 & dat$Time <= 180, ]
dat$LengthBc <- dat$LengthB - mean(dat$LengthB)
dat$WMCc <- dat$WMC - mean(dat$WMC)
dat$Position <- dat$Position - mean(dat$Position)
dat0 <- dat
#
#model with Recalled
dat <- dat0
m8 <- lmer(Amplitude ~ rcs(Position, 3) * rcs(LengthBc, 3) *
rcs(WMCc, 4) * Recalled + (1 | Subject) + (1 | Item) + (0 + rcs(WMCc,
4) | Item) + (0 + rcs(Position, 3) | Subject) + (0 + rcs(Position,
3) | Item), data = dat)
dat <- romr.fnc(m8, dat, 2.5)$data
# n.removed = 788
# percent.removed = 1.448103
m8 <- lmer(Amplitude ~ rcs(Position, 3) * rcs(LengthBc, 3) *
rcs(WMCc, 4) * Recalled + (1 | Subject) + (1 | Item) + (0 + rcs(WMCc,
4) | Item) + (0 + rcs(Position, 3) | Subject) + (0 + rcs(Position,
3) | Item), data = dat)
save(m8, file = "models/m8FzRCS.rda", compress = "xz")
#
dat <- dat0
m9 <- lmer(Amplitude ~ rcs(Position, 3) * rcs(LengthBc, 3) *
rcs(WMCc, 4) * Recalled + (1 | Subject) + (1 | Item) + (0 + rcs(WMCc,
4) | Item) + (0 + rcs(Position, 3) | Subject) + (0 + rcs(Position,
3) | Item) + (Recalled | Subject), data = dat)
dat <- romr.fnc(m9, dat, 2.5)$data
m9 <- lmer(Amplitude ~ rcs(Position, 3) * rcs(LengthBc, 3) *
rcs(WMCc, 4) * Recalled + (1 | Subject) + (1 | Item) + (0 + rcs(WMCc,
4) | Item) + (0 + rcs(Position, 3) | Subject) + (0 + rcs(Position,
3) | Item) + (Recalled | Subject), data = dat)
save(m9, file = "models/m9FzRCS.rda", compress = "xz")
#
anova(m8, m9)
# Df AIC BIC logLik Chisq Chi Df Pr(>Chisq)
# m8 87 355926 356699 -177876
# m9 90 367031 367832 -183426 0 3 1
#
# don't look at (Recalled | Item) because Recalled is property
# of items
#
# look at model diagnostic plots
png("figs/m8_mcp.png")
mcp.fnc(m8)
dev.off()
# good model criticism plots
#
# compare with model assuming linearity
dat <- dat0
m8l <- lmer(Amplitude ~ Position * LengthBc *
WMCc * Recalled + (1 | Subject) + (1 | Item) + (0 + WMCc
| Item) + (0 + Position | Subject) + (0 + Position
| Item), data = dat)
dat <- romr.fnc(m8l, dat, 2.5)$data
# n.removed = 776
# percent.removed = 1.426051
m8l <- lmer(Amplitude ~ Position * LengthBc *
WMCc * Recalled + (1 | Subject) + (1 | Item) + (0 + WMCc
| Item) + (0 + Position | Subject) + (0 + Position
| Item), data = dat)
save(m8l, file = "models/m8lFz.rda", compress = "xz")
#
anova(m8l, m8)
# Df AIC BIC logLik Chisq Chi Df Pr(>Chisq)
# m8l 22 356274 356470 -178115
# m8 87 355926 356699 -177876 478.4 65 < 2.2e-16 ***
# ---
# Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
### nonlinear much more likely
# plot the 4-way interaction
library(rms)
library(fields)
data(vars)
dat <- dat0
dat <- dat[as.numeric(rownames(m8@frame)),]
n=100
wmcO<-seq(min(vars$WMC),max(vars$WMC),length.out=n)
wmc<-seq(min(dat$WMCc),max(dat$WMCc),length.out=n)
lengthBO<-seq(min(vars$LengthB),max(vars$LengthB),length.out=n)
lengthB<-seq(min(dat$LengthBc),max(dat$LengthBc),length.out=n)
positionO<-1:6
position<-sort(unique(dat$Position))
# multiply model matrix by coefficients to get predicted values
# without the random effects
dat$fitted<-m8@X%*%fixef(m8)
# put this into a model to use predictrms
# first restrict to recalled items
tmp<-dat[dat$Recalled=="correct",]
m.ols<-ols(fitted~rcs(WMCc,4)*rcs(Position,3)*rcs(LengthBc,3),data=tmp)
# plot for each position for correctly recalled
pdf("figs/AmplitudeRecalled.pdf")
par(mfrow=c(3,2),mar=c(4.1,4.1,3.1,1.1))
zlimit<-c(-7.5, 7.5)
m.cor.list<-list()
for(p in 1:length(position)){
# create newdata for Position == p
mygrid<-expand.grid(wmc,lengthB)
mygrid$Position<-position[p]
colnames(mygrid)<-c("WMCc","LengthBc","Position")
mygrid$Predicted<-predict(m.ols,mygrid)
if(p==1){
dat.pred.cor<-mygrid
}else{
dat.pred.cor<-rbind(dat.pred.cor,mygrid)
}
# reformat to use image
m.cor<-matrix(NA,ncol=length(wmc),nrow=length(lengthB))
for(i in 1:nrow(m.cor)){ # lengthB
for(j in 1:ncol(m.cor)){ # wmc
x<-mygrid[mygrid$LengthBc==lengthB[i]&mygrid$WMCc==wmc[j],"Predicted"]
if(length(x)>0){
m.cor[i,j]<-x
}
}
}
m.cor.list[[p]]<-m.cor
# plot
len<-vector("numeric")
for(k in 1:12){
len<-c(len,lengthB[which(lengthBO==k)])
}
image.plot(lengthBO,wmcO,m.cor,col=topo.colors(50),main=paste("Position",p),
xlab=ifelse(p%in%c(1,4,5),"Length (# letters)",""),
ylab=ifelse(p%in%c(1,3,5),"WMC (from 0.50 to 0.90)",""),zlim=zlimit,
legend.args=list(text=expression(Amplitude~~(mu*V)),line=1.5,cex=0.65))
contour(lengthBO,wmcO,m.cor,nlevels=15,add=TRUE)
}
save(m.cor.list,file="data/m.cor.list.rda")
save(dat.pred.cor,file="data/dat.pred.cor.rda")
par(mfrow=c(1,1),mar=c(5.1,4.1,4.1,2.1))
dev.off()
# now restrict to incorrect items
tmp.inc<-dat[dat$Recalled=="incorrect",]
m.ols.inc<-ols(fitted~rcs(WMCc,4)*rcs(Position,3)*rcs(LengthBc,3),
data=tmp.inc)
# plot for each position for correctly recalled
m.inc.list<-list()
pdf("figs/AmplitudeNOTRecalled.pdf")
par(mfrow=c(3,2),mar=c(4.1,4.1,3.1,1.1))
zlimit<-c(-7.5, 7.5)
for(p in 1:length(position)){
# create newdata for Position == p
mygrid<-expand.grid(wmc,lengthB)
mygrid$Position<-position[p]
colnames(mygrid)<-c("WMCc","LengthBc","Position")
mygrid$Predicted<-predict(m.ols.inc,mygrid)
if(p==1){
dat.pred.inc<-mygrid
}else{
dat.pred.inc<-rbind(dat.pred.inc,mygrid)
}
# reformat to use image
m.inc<-matrix(NA,ncol=length(wmc),nrow=length(lengthB))
for(i in 1:nrow(m.inc)){ # lengthB
for(j in 1:ncol(m.inc)){ # wmc
x<-mygrid[mygrid$LengthBc==lengthB[i]&mygrid$WMCc==wmc[j],"Predicted"]
if(length(x)>0){
m.inc[i,j]<-x
}
}
}
m.inc.list[[p]]<-m.inc
# plot
len<-vector("numeric")
for(k in 1:12){
len<-c(len,lengthB[which(lengthBO==k)])
}
image.plot(lengthBO,wmcO,m.inc,col=topo.colors(50),main=paste("Position",p),
xlab=ifelse(p%in%c(1,4,5),"Length (# letters)",""),
ylab=ifelse(p%in%c(1,3,5),"WMC (from 0.50 to 0.90)",""),zlim=zlimit,
legend.args=list(text=expression(Amplitude~~(mu*V)),line=1.5,cex=0.65))
contour(lengthBO,wmcO,m.inc,nlevels=15,add=TRUE)
}
save(m.inc.list,file="data/m.inc.list.rda")
save(dat.pred.inc,file="data/dat.pred.inc.rda")
par(mfrow=c(1,1),mar=c(5.1,4.1,4.1,2.1))
dev.off()
#
# Now look at difference plots
pdf("figs/AmplitudeRecalledDifference.pdf")
par(mfrow=c(3,2),mar=c(4.1,4.1,3.1,1.1))
zlimit<-c(-7.5, 7.5)
for(p in 1:length(position)){
m.diff<-m.inc.list[[p]]-m.cor.list[[p]]
# plot
len<-vector("numeric")
for(k in 1:12){
len<-c(len,lengthB[which(lengthBO==k)])
}
image.plot(lengthBO,wmcO,m.diff,col=topo.colors(50),main=paste("Position",p),
xlab=ifelse(p%in%c(1,4,5),"Length (# letters)",""),
ylab=ifelse(p%in%c(1,3,5),"WMC (from 0.50 to 0.90)",""),zlim=zlimit,
legend.args=list(text=expression(Amplitude~~(mu*V)),line=1.5,cex=0.65))
contour(lengthBO,wmcO,m.diff,nlevels=15,add=TRUE)
}
par(mfrow=c(1,1),mar=c(5.1,4.1,4.1,2.1))
dev.off()
#
# model with logit
m.logit<-lmer(Amplitude~rcs(WMCc,4)*rcs(Position,3)*rcs(LengthBc,3)*
rcs(logit,3)+(1|Subject)+(1|Item)+(0+rcs(Position,3)|Subject)+
(0+rcs(Position,3)|Item)+(rcs(logit,3)|Subject),data=dat)
save(m.logit,file="models/m.logit.rda", compress = "xz")
# won't look at that here. A bit too complex for now.
#
@
}
}
The best-fitting model had the form {\it Amplitude as a function of rcs(Position, 3) $\times$ rcs(Length, 3) $\times$ rcs(WMC, 4) $\times$ Recalled}. It was much more likely than a model with the same fixed- and random-effect structure but for which linearity was assumed (model not assuming linearity: AIC = 355926, {\it df} = 87; model assuming linearity: AIC = 356274, {\it df} = 22; $\chi^2$ = 478.4, {\it df} = 65, {\it p} $< 0.001$). The four-way interaction was significant ($F(12, 50970) = 2.3$, $p = 0.0061$), thus answering ``yes'' to both of the questions posed above. To visualize this four-way interaction, and considering that the fourth factor, Recalled, was dichotomous, we plotted the three-way interaction as we did for the previous model, but separately for correctly- and incorrectly-recalled chunks. These appear as Figures \ref{fig:Recalled} and \ref{fig:NotRecalled}, respectively. As a means of data reduction, we have also plotted the difference between these two maps in Figure \ref{fig:Difference}.
\begin{figure}[h]
\centering
\includegraphics{figs/AmplitudeRecalled}
\caption{The three way interaction {\it Position $\times$ Length $\times$ WMC} for correctly recalled sequences. Details are as for Figure \ref{fig:m6Plot}.}
\label{fig:Recalled}
\end{figure}
\begin{figure}[h]
\centering
\includegraphics{figs/AmplitudeNOTRecalled}
\caption{The three way interaction {\it Position $\times$ Length $\times$ WMC} for incorrectly recalled sequences. Details are as for Figure \ref{fig:m6Plot}.}
\label{fig:NotRecalled}
\end{figure}
\begin{figure}[h]
\centering
\includegraphics{figs/AmplitudeRecalledDifference}
\caption{The four way interaction {\it Position $\times$ Length $\times$ WMC $\times$ Recalled} -- incorrectly minus correctly recalled. Details are as for Figure \ref{fig:m6Plot}.}
\label{fig:Difference}
\end{figure}
Keeping in mind that the plots of the data separated by Recalled are each based on less data than those collapsed across levels of this variable, and are thus likely less robust (again, especially at extreme values of WMC), we can see patterns similar to those in Figure \ref{fig:m6Plot}, such as nonlinearly increasing amplitude with Length, and at higher levels of WMC, particularly at later Positions. However, overall we can observe more negative amplitudes for recalled chunks in Figure \ref{fig:Recalled} than for non-recalled chunks in Figure \ref{fig:NotRecalled}, in particular for people with lower levels of WMC and at shorter lengths. This is largely consistent with the behavioural results, where it was found that four-word sequences for which the second word was shorter were more often successfully recalled and that participants with a higher working memory capacity had a greater likelihood of correctly recalling a sequence. Additionally, while a primacy effect was noted in the previous analysis, when we look exclusively at correctly-recalled items or at the difference plot in Figure \ref{fig:Difference} we can see a pattern that seems to correspond to the ``recency'' effect as well, whereby N1a amplitudes were particularly large for correctly-recalled items at Position 6, again showing the nonlinear increase with Length. Note that this recency effect is consistent with the one found in the behavioural analysis. Overall, these data suggest that while attentional resources (as indexed by the N1a) are not a necessary requirement for successful recall, these resources seem to be allocated preferentially to more difficult items and to a greater degree in people with more limited WMC, and such allocation correlates with later successful recall.
}
}
}
{\section{Discussion}
Relaxing the assumption of linearity enabled us to gain a better understanding of the data by capturing important structure that was overlooked in the model for which linearity was assumed. Indeed, only a main effect of Position was uncovered in that model, whereas the one for which linearity was not assumed revealed a significant main effect of Position in addition to significant {\it Position $\times$ WMC} and {\it Position $\times$ Length $\times$ WMC} interactions. Moreover, even if these interactions had reached significance in the model assuming linearity, the strength of the association between N1a amplitudes and these interactions (as indexed by the percentage of deviance explained) would have been much lower than in the model for which linearity was not assumed, as can be seen in Table \ref{tab:dev.expl}. Whereas the {\it Position $\times$ WMC} interaction in the model for which linearity was not assumed accounted for 4 times more deviance than in the model for which linearity was assumed, the {\it Position $\times$ Length $\times$ WMC} interaction in the former model accounted for 26 times more deviance than the same interaction in the latter model.
{% R code: deviance explained
<>=
load("smry/dev.expl.tab.rda")
print(dev.expl.tab)
@
}
{% latex table generated in R 3.0.2 by xtable 1.7-1 package
% Mon Nov 4 11:22:45 2013
\begin{table}[ht]
\centering
\begin{tabular}{rcc}
\hline
& Assuming Linearity & Not Assuming Linearity \\
\hline
Position & 0.0255\% & 0.0211\% \\
Length & 0.0024\% & 0.0013\% \\
WMC & 0.0002\% & 0.0011\% \\
Position $\times$ Length & 0.0005\% & 0.0108\% \\
{\bf Position $\times$ WMC} & {\bf 0.0038\%} & {\bf 0.0154\%} \\
Length $\times$ WMC & 0.0046\% & 0.0125\% \\
{\bf Position $\times$ Length $\times$ WMC} & {\bf 0.0018\%} & {\bf 0.0473\%} \\
\hline
\end{tabular}
\caption{Percentage of the deviance explained by each term of the model assuming linearity and of the one not assuming linearity. The interactions in boldface correspond to those that were significant in the model for which linearity was not assumed.}
\label{tab:dev.expl}
\end{table}
}
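For readers who wish to compute such association strengths themselves, the percentage of deviance attributable to a term can be obtained by refitting the model without that term and comparing deviances. The sketch below is illustrative only: the model formula, the data set ({\tt dat}), and the variable names are hypothetical stand-ins, not the code used for the analyses reported here.
\begin{verbatim}
## Illustrative sketch only: formula, data set `dat', and variable
## names are hypothetical, not the analysis code used above.
library(lme4)
library(rms)    # provides rcs()
m.full <- lmer(Amplitude ~ rcs(Position, 5) * WMC +
               (1 | Subject) + (1 | Item),
               data = dat, REML = FALSE)
m.red  <- update(m.full, . ~ . - rcs(Position, 5):WMC)
## percentage of deviance attributable to the Position x WMC term
100 * (deviance(m.red) - deviance(m.full)) / deviance(m.red)
\end{verbatim}
Note that the models are fitted by maximum likelihood ({\tt REML = FALSE}) so that their deviances are directly comparable.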
Finally, we were able to push our understanding of the data even further by allowing the {\it Length $\times$ WMC} smooth surfaces at each Position to vary according to whether a four-word sequence was correctly or incorrectly recalled. This four-way interaction was significant even after accounting for subject- and item-wise random intercepts and restricted cubic splines for Position and WMC.
In spite of the clear potential advantage of nonlinear analyses in capturing the true structure of the data, it is important to recognize the limitations of this technique. First, the relationship between two variables may in fact be best described by a straight line, in which case nonlinear analysis is unnecessary; this will, however, be recognized if model comparison testing is used to determine whether nonlinear fitting is warranted. Second, with a more complex analytical model comes more complex interpretation. In the present case, however, we have demonstrated that this complexity is not necessarily uninterpretable, and indeed the results provide a much richer understanding of the data than could be obtained under linear assumptions. A third issue, which we have not detailed thus far, concerns what sort of nonlinear model to fit and what assumptions to make in this regard.
A crucial step in modelling nonlinear relationships between two variables is parametrizing the splines --- determining how many knots they will have. The approach we used here was mainly based on visual inspection of averages and lowess smooths. A drawback with this procedure is that it is essentially arbitrary and open to investigator bias. A more data-driven method to determine the correct parametrization of the splines would be to fit a series of successively more complex models (i.e., with different numbers of knots) and select the one with the lowest AIC. If there is only one predictor variable, this may be done relatively easily. However, the space of possible models increases drastically as the number of predictors in the model increases. For instance, if we set the upper bound on the number of knots to 5 for each of three predictors, we would have to fit 125 different models; for a maximum of 10 knots we would require comparisons among 1000 model fits.
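Such a grid search over knot counts could be sketched as follows. This is an illustration only: the data set ({\tt dat}) and variable names are hypothetical, the search is restricted to 3--5 knots per predictor (27 fits rather than the 125 discussed above, since {\tt rcs} requires at least 3 knots), and for simplicity ordinary least squares is used rather than a mixed-effects model.
\begin{verbatim}
## Illustrative sketch only: `dat' and variable names are hypothetical.
library(rms)
grid <- expand.grid(kP = 3:5, kL = 3:5, kW = 3:5)  # 27 candidate models
aics <- apply(grid, 1, function(k)
  AIC(ols(Amplitude ~ rcs(Position, k[1]) + rcs(Length, k[2]) +
          rcs(WMC, k[3]), data = dat)))
grid[which.min(aics), ]  # knot combination with the lowest AIC
\end{verbatim}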
An alternative, data-driven approach is to avoid the necessity of performing such a search by using semi- or non-parametric methods such as generalized additive mo\-del\-ling (GAM) and generalized additive mixed-effects mo\-del\-ling (GAMM). The GA(M)M algorithm as implemented in package {\tt mgcv} \cite{Wood2012} strikes a balance between under- and over-smoothing the data by using a fixed number of knots together with penalized regression, in which the least-squares objective is augmented by a ``wiggliness'' penalty. The weight of this wiggliness penalty is determined, for example, by way of generalized cross-validation (GCV). Crucially, the user only needs to specify an upper bound on the number of knots, and GA(M)M determines how many knots it actually needs to model the data. That is, the exact number of knots passed to the cubic regression spline function is not generally critical in this framework; the only concern is to provide a large enough upper bound to enable GA(M)M to represent the underlying ``truth'' reasonably well. Although we do not explore the modelling of ERP data using GA(M)M here, an example of how it performs was provided in Figure \ref{fig:example0}: each of the models for which linearity was not assumed (the right column) was fitted using GAM, with the upper bound on the number of knots set to 10. In panel B, GAM determined that the number of knots to be used was equal to 0. In panel D, it settled on a linear relationship between X and Y. While it deemed that 3 knots were necessary to model the simulated quadratic curve in panel F, it determined that 7 knots were required to capture the nonlinear relationship in panel H. We refer the interested reader to \citeA{Wood2006}, \citeA{Faraway2005}, \citeA{Keele2008}, and \citeA{Zuur2009} for more details on generalized additive (mixed-effects) modelling.
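To give a concrete flavour of this workflow, the following minimal sketch fits a penalized cubic regression spline to simulated data with {\tt mgcv}. The user supplies only the upper bound {\tt k} on the basis dimension; the penalty, estimated here by GCV (the default for Gaussian models in {\tt gam}), determines how wiggly the fitted smooth actually is.
\begin{verbatim}
## Minimal simulated example (not the ERP analysis reported above).
library(mgcv)
set.seed(1)
x <- runif(200)
y <- sin(2 * pi * x) + rnorm(200, sd = 0.3)
m <- gam(y ~ s(x, bs = "cr", k = 10))  # cubic regression spline basis
summary(m)$edf   # effective degrees of freedom actually used
plot(m)          # fitted smooth with confidence band
\end{verbatim}
Inspecting the effective degrees of freedom shows how much of the allowed flexibility the penalized fit actually used; a value near 1 indicates an essentially linear relationship.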
}
{\section{Conclusion}
In order to deal with the complexity of ERP data, it is very common to assume that the relationship between ERP amplitudes/latencies and independent variables is linear. This assumption does not always hold, however. Although in some ERP studies the assumption of linearity has been relaxed, these studies have typically employed the common practice of discretizing continuous variables, which has been shown elsewhere to be suboptimal.
In this paper, we have demonstrated how nonlinearities in ERP data can be modelled using restricted cubic splines and mixed-effects regression in {\tt R}. This revealed that the assumption of linearity may lead one to draw incorrect inferences from ERP data. More specifically, we have illustrated how the assumption of linearity may (i) increase the probability of false negatives, (ii) if a relationship is obtained, provide a limited or incorrect representation of the true relationship between variables, and (iii) underestimate the strength of the association between variables.
}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\bibliography{references}
\end{document}