\documentclass[doc,apacite]{apa6}
\usepackage{Sweave}
\usepackage{animate}
\usepackage{graphicx}
\usepackage{rotating}
\usepackage{verbatim}
\usepackage{enumerate}
\usepackage{amssymb}
\usepackage[english]{babel}
\usepackage[utf8]{inputenc}
\usepackage{floatrow}
\floatsetup[table]{capposition=top}
\usepackage{dcolumn}
\newcolumntype{d}[1]{D{ }{\cdot}{#1}} %the argument for d specifies the maximum number of decimal places
% some commands for avoiding floats flushed to the end
\renewcommand{\topfraction}{0.9} % max fraction of floats at top
\renewcommand{\bottomfraction}{0.8} % max fraction of floats at bottom
\setcounter{topnumber}{2}
\setcounter{bottomnumber}{2}
\setcounter{totalnumber}{4} % 2 may work better
\renewcommand{\textfraction}{0.07} % allow minimal text w. figs
\renewcommand{\floatpagefraction}{0.7} % require fuller float pages
\usepackage{minted}
%\renewenvironment{Sinput}{\minted[frame=single]{r}}{\endminted}
\renewenvironment{Sinput}{\minted{r}}{\endminted}
\DefineVerbatimEnvironment{Soutput}{Verbatim}{frame=leftline}
\DefineVerbatimEnvironment{Scode}{Verbatim}{}
\title{Modelling Non-linear Relationships in ERP Data Using Mixed-effects Regression with R Examples}
%\VignetteIndexEntry{Modelling Non-linear Relationships in ERP Data Using Mixed-effects Regression with R Examples}
\shorttitle{Non-linear Relationships in ERP Data}
\twoauthors{Antoine Tremblay}{Aaron J. Newman}
\twoaffiliations{Dalhousie University, Halifax, Nova Scotia, Canada}{Dalhousie University, Halifax, Nova Scotia, Canada}
\leftheader{Tremblay \& Newman}
\abstract{
In the analysis of psychological and psychophysiological data, the relationship
between two variables is often assumed to be a straight line. This may be due
to the prevalence of the general linear model in data analysis in these fields,
which makes this assumption implicitly. However, there are many problems for
which this assumption does not hold. In this paper, we show that in the
analysis of \textcolor{red}{event-related} potential (ERP) data, the assumption
of linearity comes at a cost and may significantly affect the inferences drawn
from the data. We demonstrate why the assumption of linearity should be relaxed
and how to model nonlinear relationships between ERP amplitudes and predictor
variables within the familiar framework of generalized linear models, using
\textcolor{red}{regression} splines and mixed-effects
\textcolor{red}{modelling}.
\hspace{2em}This paper has been written in \LaTeX{} using
\texttt{Sweave} and \texttt{R}, and the source document is provided as
supplementary material. The data, a \texttt{pdf} of this paper, the
\texttt{.Rnw} file used to write it, and the \texttt{R} code used to generate
all of the analyses, tables, and figures presented here are available in
package \texttt{nlEEG}. The package is available as supplementary material for
this article, and also from \texttt{http://hdl.handle.net/10222/22146}.
\textcolor{red}{An additional set of supplementary material, entitled {\it
What's Under the Hood: A Brief Introduction to GAMM}, is available from the
same URL.} These resources can be used to work through the examples, and
potentially act as a starting point for the reader's own forays into
mixed-effects \textcolor{red}{modelling}.
}
\keywords{EEG; ERP; nonlinear relationships; GAMM; regression splines.}
\authornote{Corresponding author: Antoine Tremblay, NeuroCognitive Imaging Laboratory, Dalhousie University, Halifax, NS B3H 4R2, Canada. Tel.: (902) 494-1911. A.T. was supported during data collection by a Social Sciences and Humanities Research Council of Canada (SSHRC) doctoral fellowship while in the Department of Linguistics at the University of Alberta, Edmonton, Canada (2007--2009). A.T. was supported during analysis and writing by a SSHRC post-doctoral fellowship in the Department of Psychology and Neuroscience at Dalhousie University, Halifax, Canada (2011--2013). A.J.N. was supported by a SSHRC Standard Research Grant.}
\begin{document}
\maketitle
% to generate pdf:
% R CMD Sweave eeg_non_linear_v7.Rnw
% pdflatex --shell-escape eeg_non_linear_v7.tex
% bibtex eeg_non_linear_v7
% pdflatex --shell-escape eeg_non_linear_v7.tex
% pdflatex --shell-escape eeg_non_linear_v7.tex
{{\section{Introduction}
{{% first part of intro
Event-related potential (ERP) datasets are commonly collected in cognitive
neuroscience experiments because they offer rich spatio-temporal information
about brain activity during perception, cognition, and action. The potential
power of ERP data comes with the cost that the datasets tend to be large and
quite complex. Thus the richness of the data demands analytical techniques that
are appropriate to fully describe both the complexity of the data, and control
for factors that might interfere with such analyses. In trying to deal with
such complex data, it is very common to assume that the relationship between
ERP amplitudes/latencies and an independent variable is linear. For example, in
a study that investigates the relationship between age and amplitude of the N1
ERP component, one would assume that a given increase in age is associated with
a consistent increase (or decrease) in the amplitude and/or latency of the N1,
regardless of whether the increase is from 23 to 24 years old or from 69 to 70
years old.

In reality, there are many problems for which the answer is not a straight line
\cite{Rupert2003, Pinheiro2000, Wood2006, WuZhang2006, Hastie1990,
BaayenKuperman2010, TremblayBaayen2010, Kryuchkova2012} and ERPs are no
exception. For example, \citeA{Carbon2005} investigated the effect of
``Thatcherized'' faces (in which the eyes and mouth regions are turned
upside-down) on N170 amplitudes (a negative component occurring between 130 and
200 ms at occipito-temporal scalp sites). Pictures of normal and Thatcherized
faces were presented at either 0, 90, or 180$^{\circ}$ rotation; only when
presented upright are these perceived as severely distorted. \citeA{Carbon2005}
found a nonlinear effect of rotation on N170 amplitudes in response to faces:
Upright Thatcherized faces elicited a smaller N170 than Thatcherized faces
rotated 90 or 180$^{\circ}$. The effect on N170 amplitudes of the latter two
orientations, however, did not differ. \citeA{Boutheina2009} also investigated
the effects of picture rotation on N170 amplitudes. They used pictures of
normal faces that were rotated 0, 22.5, 45, 67.5, 90, 112.5, 157.5, or
180$^{\circ}$ and found a relatively complex nonlinear effect on N170
amplitudes and latencies \cite[]{Boutheina2009}.

Numerous other examples of nonlinear effects on ERPs can be found in the
literature, in areas such as development \cite{Webb2005}, rhythm perception
\cite{PablosMartin2007}, and auditory processing \cite{Inui2010}.

The studies mentioned above did not assume linearity, which is a step forward.
However, in keeping with the traditional ANOVA approach to statistical
analysis, they tested for nonlinear relationships using the common practice of
discretizing the variables of interest. That is, they reduced inherently
continuous measures, such as rotation angle (or stimulus duration,
interstimulus interval, magnitude, frequency, etc.), to factor variables with
only a few levels. Such a practice is likely attributable to the natural
development of cognitive neuroscience from behavioural approaches to studying
cognition that traditionally rely heavily on ANOVA models. However, it comes
with problems of its own. Indeed, not only does discretization induce bias into
the analysis, but it also decreases statistical power and increases the
probability of finding spurious associations \cite{Cohen:1983, McCallum2002}.
Fortunately, such factorization is neither desirable nor necessary to detect
nonlinear relationships.\footnote{\textcolor{red}{Although we advocate here
that continuous variables should not be discretized, there are circumstances in
which such variables should be divided into categories. For example, if the
values of a variable happen to be distributed in such a way that there are two
distinct peaks separated by a large gap, then it would make sense to
dichotomize it. It goes without saying that an inherently binomial variable
should be entered in a model as a factor variable.}}
}}
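The cost of discretizing a continuous predictor can be checked directly with a small simulation. The sketch below uses hypothetical data and arbitrary effect sizes, chosen only for illustration: a median split discards information about where each observation falls along the continuum, and the dichotomized model accordingly explains less variance than the model that keeps the predictor continuous.

```r
# Illustration (hypothetical data): median-splitting a continuous predictor
# discards information and typically weakens the model fit.
set.seed(1)
x <- runif(200)                        # continuous predictor
y <- 2 * x + rnorm(200, sd = 1)        # linear truth plus noise
x.split <- factor(x > median(x),       # dichotomized version of x
                  labels = c("low", "high"))

m.cont <- lm(y ~ x)                    # predictor kept continuous
m.dich <- lm(y ~ x.split)              # predictor median-split

c(continuous   = summary(m.cont)$r.squared,
  dichotomized = summary(m.dich)$r.squared)
```

With this seed the continuous model explains noticeably more variance; \citeA{Cohen:1983} shows that median-splitting a normally distributed predictor costs about as much statistical power as discarding over a third of the observations.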
{{\subsection{Relaxing the assumption of linearity: A set of simple examples}
Assuming linearity when it does not hold may lead one (i) to overlook
important structure in the data, (ii) to conclude that there is no relationship
between two variables when in fact there is one, and (iii) to under-estimate
the strength of the relationship between a dependent and an independent
variable \cite{Sahai1997}. To illustrate these points, we simulated four data
sets \cite[]{Wood2012, R300} and compared models that assume linearity to ones
that do not (for the sake of this example, we gloss over how the assumption of
linearity was relaxed for the latter analyses). In Figure \ref{fig:example0},
the panels on the left show the models that were fitted assuming linearity,
whereas the ones on the right show the models that were fitted without making
this assumption. In the first simulated data set, shown in the top panels (A
and B), the dependent (Y) variable does not correlate with the predictor
variable, and neither of the two models reports that it does. The 2-point
difference in \textcolor{red}{Akaike's Information Criterion
\cite[]{Akaike1973}} indicates that both models are
equivalent.\footnote{\textcolor{red}{AIC is a relative measure that indexes how
likely a model is given the data \textcolor{red}{(relative to other models)}.
It is calculated as {\it 2 $\times$ the number of parameters in the model - 2
$\times$ the log likelihood of the model} (i.e., more complex models will tend
to have higher AIC values). Statistical models can be compared by way of AIC,
where those with} smaller values reflect more likely models. By convention, we
consider that two models differ if the difference in AIC value is equal to or
greater than 5, meaning that the model with the lower AIC is 12 times more
likely \textcolor{red}{given the data} than the one with the higher AIC.
\textcolor{red}{We refer the interested reader to \citeA{Symonds2011} and
\citeA{Burnham2011} for more details about AIC and model selection.}}
Panels C and D show data in which there is a linear relationship between the
response and predictor variables, which is captured by both linear and
nonlinear models. Indeed, the nonlinear model fit results in a straight line
since there is no nonlinearity in the relationship, and the AIC values
indicate that both models are just as likely given the data. The simulated
relationship between the dependent and independent variables in panels E and F
is quadratic. The model for which linearity is assumed (panel E) not only
fails to capture the quadratic relationship, but it also falsely concludes that
there is no significant relationship between X and Y. The model shown in panel
F, however, is able to capture the quadratic trend and comparison of the AIC
values for the two models confirms that this is a better fit. In panels G and
H, the simulated relationship is more complex, taking on a ``wavy'' function.
Although the model for which linearity is assumed does find a significant
relationship between X and Y, the fitted straight line captures only part of
the underlying truth. This leads the model to under-estimate the strength of
the relationship between X and Y, and the accuracy of the prediction varies as
a function of X. Conversely, the model for which linearity
is not assumed, illustrated in panel H, is able to capture the true nature of
the relationship, and correspondingly has a much lower AIC value.
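The AIC conventions used above can be reproduced in miniature. The sketch below uses freshly simulated data (not the simulated data sets behind Figure \ref{fig:example0}): it computes the evidence ratio implied by an AIC difference, and compares a linear to a nonlinear fit of a quadratic relationship of the kind shown in panels E and F.

```r
library(mgcv)  # for gam() and penalized regression splines

# Evidence ratio implied by an AIC difference: the lower-AIC model is
# exp(dAIC / 2) times more likely given the data.
evidence.ratio <- function(dAIC) exp(dAIC / 2)
round(evidence.ratio(5))  # ~12, the conventional cut-off used in the text

# A simulated quadratic relationship
set.seed(2)
x <- runif(300)
y <- 6 * (x - 0.5)^2 + rnorm(300, sd = 0.3)
m.lm  <- lm(y ~ x)                  # assumes linearity
m.gam <- gam(y ~ s(x, bs = "cs"))   # relaxes the assumption
AIC(m.lm) - AIC(m.gam)              # large positive: the nonlinear fit is preferred
```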
{{% Rcode: load function plotGamTweaked
{\footnotesize
<<>>=
library(mgcv)
options(warn=-1)
plotGamTweaked<-function(x,residuals=FALSE,se=TRUE,
select=1,n=200,n2=n,seWithMean=FALSE,...){
i<-select
w.resid<-NULL
m<-length(x$smooth)
order<-attr(x$pterms,"order")
if(is.numeric(se)){
if(se<1){
se<-FALSE
}else{
se<-TRUE
}
}
if(se&&x$Vp[1,1]<=0){
se<-FALSE
warning("No variance estimates available")
}
if(residuals){
fv.terms<-predict(x,type="terms")
if (is.null(w.resid)){
w.resid<-x$residuals*sqrt(x$weights)
}
}
if(m>0){
if(x$smooth[[i]]$dim==1){
raw<-x$model[x$smooth[[i]]$term]
xx<-seq(min(raw),max(raw),length = n)
if(x$smooth[[i]]$by!="NA"){
by<-rep(1,n)
dat<-data.frame(x=xx,by=by)
names(dat)<-c(x$smooth[[i]]$term,x$smooth[[i]]$by)
}else{
dat<-data.frame(x=xx)
names(dat)<-x$smooth[[i]]$term
}
X<-PredictMat(x$smooth[[i]],dat)
first<-x$smooth[[i]]$first.para
last<-x$smooth[[i]]$last.para
p<-x$coefficients[first:last]
offset<-attr(X,"offset")
if(is.null(offset)){
fit<-X%*%p
}else{
fit<-X%*%p+offset
}
if(se){
if(seWithMean&&attr(x$smooth[[i]],"nCons")>0){
X1<-matrix(x$cmX,nrow(X),ncol(x$Vp),
byrow=TRUE)
meanL1<-x$smooth[[i]]$meanL1
if(!is.null(meanL1)){
X1<-X1/meanL1
}
X1[,first:last]<-X
se.fit<-sqrt(rowSums((X1%*%x$Vp)*X1))
}else{
se.fit<-sqrt(rowSums((X%*%x$Vp[first:last,
first:last])*X))
}
}
edf<-sum(x$edf[first:last])
xterm<-x$smooth[[i]]$term
pd.item<-list(fit=fit,dim=1,x=xx,raw=raw[[1]])
if(residuals){
pd.item$p.resid<-fv.terms[,length(order)+i]+w.resid
}
if(se)pd.item$se=se.fit
}else if(x$smooth[[i]]$dim==2){
xterm<-x$smooth[[i]]$term[1]
yterm<-x$smooth[[i]]$term[2]
raw<-data.frame(x=x$model[xterm][[1]],
y=x$model[yterm][[1]])
n2<-max(10,n2)
xm<-seq(min(raw$x),max(raw$x),length=n2)
ym<-seq(min(raw$y),max(raw$y),length=n2)
xx<-rep(xm,n2)
yy<-rep(ym,rep(n2,n2))
if(x$smooth[[i]]$by!="NA"){
by<-rep(1,n)
dat<-data.frame(x=xx,y=yy,by=by)
names(dat)<-c(xterm,yterm,x$smooth[[i]]$by)
}else{
dat<-data.frame(x=xx,y=yy)
names(dat)<-c(xterm,yterm)
}
X<-PredictMat(x$smooth[[i]],dat)
first<-x$smooth[[i]]$first.para
last<-x$smooth[[i]]$last.para
p<-x$coefficients[first:last]
offset<-attr(X,"offset")
if(is.null(offset)){
fit<-X%*%p
}else{
fit<-X%*%p+offset
}
if(se){
if(seWithMean&&attr(x$smooth[[i]],"nCons")>0){
X1<-matrix(x$cmX,nrow(X),ncol(x$Vp),
byrow=TRUE)
meanL1<-x$smooth[[i]]$meanL1
if(!is.null(meanL1)){
X1<-X1/meanL1
}
X1[,first:last]<-X
se.fit<-sqrt(rowSums((X1%*%x$Vp)*X1))
}else{
se.fit<-sqrt(rowSums((X%*%x$Vp[first:last,
first:last])*X))
}
}
edf<-sum(x$edf[first:last])
pd.item<-list(fit=fit,dim=2,xm=xm,ym=ym,raw=raw)
if(se){
pd.item$se=se.fit
}
if(residuals){
pd.item$p.resid<-fv.terms[,length(order)+i]+w.resid
}
}else{
pd.item<-list(dim=x$smooth[[i]]$dim)
}
}
if("xm"%in%names(pd.item)){
res<-list(xm=pd.item$xm,ym=pd.item$ym,mat=pd.item$fit,n=n)
# rows are xm and columns are ym
# rownames(mat)<-res$xm
# colnames(mat)<-res$ym
}else{
res<-list(x=pd.item$x,y=pd.item$fit,n=n)
}
if(se){
res$se=pd.item$se
}
if(residuals)res$partial.resids<-pd.item$p.resid
return(invisible(res))
}
@
}
}}
{{% Rcode: load function simOsc
{\footnotesize
<<>>=
simOsc = function(
n=60, # grid size
Xdamped=0, # whether wave in X dimension should be damped
XdampedDegree=0.02, # degree of damping in X dimension
Xdomain=c(-3,3), # range for X
Xoffset = 0, # shift for horizontal dimension
Ydamped=0, # whether wave in Y dimension should be damped
YdampedDegree=0.02, # degree of damping in Y dimension
Ydomain=c(1.2,7.2), # range for Y
Yoffset = 0, # shift for vertical dimension
mirror = 1, # mirror in X-Y plane if mirror=-1
plot=TRUE # show plot of surface
) {
# the matrix fill order below is unusual; it matches the orientation required by image()
t = sin(2*(seq(Xdomain[1],Xdomain[2],length=n)+Xoffset))*exp(-XdampedDegree*(n:1)*Xdamped)
f = sin(seq(Ydomain[1], Ydomain[2],length=n)+Yoffset)*exp(-YdampedDegree*(1:n)*Ydamped)
m=matrix(0,n,n)
for (i in 1:n) {
m[i,] = t[i]*f
}
m = m*mirror
if (plot) {
image(m, col=topo.colors(30),axes=FALSE, xlab="Time", ylab="Predictor")
contour(m, add=TRUE)
box()
}
return(t(m)[n:1,]) # in normal matrix form
}
@
}
}}
{{% Rcode: linear vs nonlinear
{\footnotesize
<<>>=
pdf(file="figs/fig-linearVSnonlinear.pdf",height=10,width=10)
par(mfrow = c(4, 2))
ylimit <- c(-2.5, 20)
#
# generate noise
#dat <- data.frame(x = 1:100, y = runif(100,min = -10, max = 10)) + 7.5
#save(dat,file="data/simAB.rda",compress="xz")
load("data/simAB.rda")
m <- gam(y ~ s(x, bs = "cs"), data = dat)
m.lm <- lm(y ~ x, data = dat)
myaic<-c(AIC(m),AIC(m.lm))
plot.dat <- plotGamTweaked(m, select = 1, n = nrow(dat))
# lm model 0
plot(dat$x, dat$y, ylim = ylimit, xlab = "variable X",
ylab = "variable Y", cex.lab = 1.5, cex.axis = 1.25)
title(main = "(A)", cex.main = 1.5, line = 3)
title(main = paste("p-value =", round(summary(m.lm)$coefficients[2, 4], 2),
sep=" "), cex.main = 1.5, line = 1.7)
title(main = paste("R-squared = ", round(summary(m.lm)$r.squared,
2), "; AIC = ", round(myaic[2]), sep = ""), cex.main = 1.5, line = 0.4)
abline(m.lm, col = "red", lwd = 2.5)
# gam model 0
plot(dat$x, dat$y, ylim = ylimit, xlab = "", ylab = "", cex.axis = 1.25)
title(main = "(B)", cex.main = 1.5, line = 3)
title(main = paste("p-value =",round(summary(m)$s.table[1,4],2), sep=" "),
cex.main = 1.5, line = 1.7)
title(main = paste("R-squared = ", round(summary(m)$r.sq, 2),
"; AIC = ", round(myaic[1]), sep = ""), cex.main = 1.5, line = 0.4)
lines(plot.dat$x, plot.dat$y + summary(m)$p.table["(Intercept)", "Estimate"],
col = "blue", lwd = 2.5)
#
# generate simulated data
load("data/simCD.rda")
#dat <- gamSim()
#save(dat,file="data/simCD.rda",compress="xz")
m <- gam(y ~ s(x1, bs = "cs"), data = dat)
m.lm <- lm(y ~ x1, data = dat)
myaic<-c(AIC(m),AIC(m.lm))
plot.dat <- plotGamTweaked(m, select = 1, n = nrow(dat))
# lm model 1
plot(dat$x1, dat$y, ylim = ylimit, xlab = "", ylab = "", cex.axis = 1.25)
title(main = "(C)", cex.main = 1.5, line = 3)
title(main = "p-value < 0.001", cex.main = 1.5, line = 1.7)
title(main = paste("R-squared = ", round(summary(m.lm)$r.squared,
2), "; AIC = ", round(myaic[2]), sep = ""), cex.main = 1.5, line = 0.4)
abline(m.lm, col = "red", lwd = 2.5)
# gam model 1
plot(dat$x1, dat$y, ylim = ylimit, xlab = "", ylab = "", cex.axis = 1.25)
title(main = "(D)", cex.main = 1.5, line = 3)
title(main = "p-value < 0.001",
cex.main = 1.5, line = 1.7)
title(main = paste("R-squared = ", round(summary(m)$r.sq, 2),
"; AIC = ", round(myaic[1]), sep = ""), cex.main = 1.5, line = 0.4)
lines(plot.dat$x, plot.dat$y + summary(m)$p.table["(Intercept)", "Estimate"],
col = "blue", lwd = 2.5)
#
# generate simulated data
ylimit <- c(-5, 8.5)
#load("data/simEF.rda")
#dat <- gamSim(4)
#save(dat,file="data/simEF.rda",compress="xz")
#dat<-dat[dat$fac=="1",]
#dat2<-dat
#dat2$y<-jitter(dat2$y,amount=2)
#dat<-rbind(dat,dat2)
#dat[dat$x2>0.3&dat$x2<=0.7,"y"]<-dat[dat$x2>0.3&dat$x2<=0.7,"y"]*1.25
#dat[dat$x2<=0.2,"y"]<-dat[dat$x2<=0.2,"y"]/2
#dat[dat$x2<=0.2,"y"]<-jitter(dat[dat$x2<=0.2,"y"],amount=2)
#save(dat,file="data/simEF_v2.rda",compress="xz")
load("data/simEF_v2.rda")
m <- gam(y ~ s(x2, bs = "cs"), data = dat)
m.lm <- lm(y ~ x2, data = dat)
myaic<-c(AIC(m),AIC(m.lm))
plot.dat <- plotGamTweaked(m, select = 1, n = nrow(dat))
# lm model
plot(dat$x2, dat$y, ylim = ylimit, xlab = "", ylab = "", cex.axis = 1.25)
title(main="(E)", cex.main = 1.5, line = 3)
title(main = paste("p-value =", round(summary(m.lm)$coefficients[2,4], 2),
sep=" "), cex.main = 1.5, line = 1.7)
title(main = paste("R-squared = ", round(summary(m.lm)$r.squared,
2), "; AIC = ", round(myaic[2]), sep = ""), cex.main = 1.5, line = 0.4)
abline(m.lm, col = "red", lwd = 2.5)
# gam model
plot(dat$x2, dat$y, ylim = ylimit, xlab = "", ylab = "", cex.axis = 1.25)
title(main = "(F)", cex.main = 1.5, line = 3)
title(main = "p-value < 0.001", cex.main = 1.5, line = 1.7)
title(main = paste("R-squared = ", round(summary(m)$r.sq, 2),
"; AIC = ", round(myaic[1]), sep = ""), cex.main = 1.5, line = 0.4)
lines(plot.dat$x, plot.dat$y + summary(m)$p.table["(Intercept)", "Estimate"],
col = "blue", lwd = 2.5)
#
# generate simulated data
ylimit <- c(-2.5, 20)
load("data/simGH.rda")
#dat <- gamSim()
#save(dat,file="data/simGH.rda",compress="xz")
m <- gam(y ~ s(x2, bs = "cs"), data= dat)
m.lm <- lm(y ~ x2, data = dat)
myaic<-c(AIC(m),AIC(m.lm))
plot.dat <- plotGamTweaked(m, select = 1, n = nrow(dat))
# lm model 3
plot(dat$x2, dat$y, ylim = ylimit, xlab = "", ylab = "", cex.axis = 1.25)
title(main = "(G)", cex.main = 1.5, line = 3)
title(main = "p-value < 0.001", cex.main = 1.5, line = 1.7)
title(main = paste("R-squared = ", round(summary(m.lm)$r.squared,
2), "; AIC = ", round(myaic[2]), sep = ""), cex.main = 1.5, line = 0.4)
abline(m.lm, col = "red", lwd = 2.5)
# gam model 3
plot(dat$x2, dat$y, ylim = ylimit, xlab = "", ylab = "", cex.axis = 1.25)
title(main = "(H)", cex.main = 1.5, line = 3)
title(main = "p-value < 0.001", cex.main = 1.5, line = 1.7)
title(main = paste("R-squared = ", round(summary(m)$r.sq, 2),
"; AIC = ", round(myaic[1]), sep = ""), cex.main = 1.5, line = 0.4)
lines(plot.dat$x, plot.dat$y + summary(m)$p.table["(Intercept)", "Estimate"],
col = "blue", lwd = 2.5)
par(mfrow=c(1,1))
dev.off()
@
}
}}
{{% Figure: linear vs non-linear
\begin{figure}[h]
\centering
\includegraphics[width=0.9\textwidth,height=0.7\textheight]{figs/fig-linearVSnonlinear.pdf}
\caption{Assuming linearity (left column; red lines) versus not assuming linearity (right column; blue lines).}
\label{fig:example0}
\end{figure}
}}
}}
{{\subsection{Modelling Non-linear Relationships}
Testing for nonlinear relationships can be seen as a natural extension of
testing for linear relationships. Linear regression is familiar to virtually
anyone with a modicum of training in statistics, and allows one to test for a
linear relationship between two variables as shown in the left-hand columns of
Figure \ref{fig:example0}. Thus for example in the analysis of the effects of
image rotation in the \citeA{Carbon2005} study described above, the levels of
rotation could have been treated as values along a continuum of possible
rotation angles, rather than as categorical levels. The problem with this
approach, as demonstrated in the examples in Figure \ref{fig:example0}, is that
if the relationship is nonlinear then a linear fit will not be optimal. In
contrast, pairwise testing of different ``levels'' of rotation is able to
detect which (if any) pairs of levels differ. Certain relationships can also be
detected using a set of linear contrast weights, as in Helmert or polynomial
contrasts. However, this approach does not generalize easily to many levels of
a factor, and because each pairwise or contrast test assumes independence,
correction for multiple comparisons should be applied --- potentially weakening
sensitivity to extant effects.
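For concreteness, the polynomial contrast weights just mentioned can be inspected in base R; the number of levels below is arbitrary. Each column codes one orthogonal trend (linear, quadratic, and so on), and each tested trend adds to the multiple-comparison burden described above.

```r
# Orthogonal polynomial contrast weights for a 5-level factor
# (e.g., five rotation angles); the columns are the linear,
# quadratic, cubic, and quartic trends.
cp <- contr.poly(5)
colnames(cp)              # ".L" ".Q" ".C" "^4"
round(crossprod(cp), 10)  # identity matrix: the trend columns are orthonormal
```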
Non-linear relationships can however be modelled without discretization
\textcolor{red}{by applying a series of linear transformations to the predictor
variables. Regression splines, which are thought to be the ideal series of
transformations and the most computationally efficient \cite<>[p.
148]{Wood2006}, are curves made up of sections of polynomials joined together
at points termed ``knots'' so that they are continuous in value, as well as
first and second derivatives. The specific spline equation used in the model
fitting algorithm we use here (i.e., generalized additive mixed-effects
regression; see below) is from \citeA{Gu2002}:}
\begin{equation}
\label{eq:rspline}
\color{red}
R(x, k) = \frac{[(k - \frac{1}{2})^2 - \frac{1}{12}]
[(x - \frac{1}{2})^2 - \frac{1}{12}]}{4} - \frac{[(|x - k| -
\frac{1}{2})^4 - \frac{1}{2}(|x - k| - \frac{1}{2})^2 +
\frac{7}{240}]}{24}
\noindent \textcolor{red}{where $x$ is a vector of covariate values and $k$ is a
knot location.}
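Equation (\ref{eq:rspline}) translates directly into a few lines of R. This is the cubic spline kernel of \citeA{Gu2002} as presented by \citeA{Wood2006}, written for covariate values and knot locations scaled to the unit interval:

```r
# Cubic regression spline basis function R(x, k) of Gu (2002),
# for covariate values x and a knot location k, both scaled to [0, 1]
R <- function(x, k) {
  ((k - 0.5)^2 - 1/12) * ((x - 0.5)^2 - 1/12) / 4 -
    ((abs(x - k) - 0.5)^4 - 0.5 * (abs(x - k) - 0.5)^2 + 7/240) / 24
}

R(0.5, 0.5)                 # 1/320
R(0.2, 0.7) == R(0.7, 0.2)  # TRUE: the kernel is symmetric in x and k
```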
The \textcolor{red}{spline} function merely serves to transform variables. The
actual modelling of the data will be performed with \textcolor{red}{generalized
additive mixed-effects modelling}, which we briefly describe below.
}}
{{\subsection{\textcolor{red}{Generalized Additive Mixed Effects Modelling}}
\textcolor{red}{Generalized additive mixed-effects modelling (GAMM) is a
relatively recent development in statistics \cite[]{Hastie1990, Gu1991,
Wood2006} and has recently been applied to ERP data in several published
papers, including \citeA{Tremblay2009}, \citeA{TremblayBaayen2010}, and
\citeA{Kryuchkova2012}.\footnote{\textcolor{red}{Generalized additive
(mixed-effect) modelling is a widely accepted data analysis method that has
been used in well over a thousand papers from a wide variety of domains of
inquiry. See section {\it Domains of Inquiry in which GAMM has been Used} in
supplementary material {\it What's Under the Hood: A Brief Introduction to
GAMM}.}} Mixed-effect modelling (e.g., GAMM) is a natural tool for modelling
repeated measures \cite{Lin1999, WuZhang2006, Wu2010}, including repeated
measurements in ERP data \cite{Bagiella2000, Vossen2011}. This type of model
captures the dependency between repeated measurements -- such as for example
within subjects as well as within stimuli -- by modelling the
variance-covariance matrix of the error term.} Details about mixed-effects
modelling, as well as its potential advantages over repeated measures ANOVA,
can be found in a number of recent papers and books \cite{Bagiella2000,
Pinheiro2000, Faraway2005, WuZhang2006, Gellman2007, Baayen:2008, Baayen2008,
Quene2008, Zuur2009, Wu2010, Vossen2011}.
\textcolor{red}{A GAMM is a non-parametric regression model with the general
additive structure:}
\begin{equation}
\color{red}
g(\mu_{ij}) = \mathbf{X_i^*}\theta + f_1(x_{1i})+f_2(x_{2i}) +
f_3(x_{3i},x_{4i}) + \ldots + \epsilon_{ij} + \mathbf{Z_{ij}\alpha_{i}},
\end{equation}
\\
\noindent \textcolor{red}{where $\mu_{ij} \equiv \mathbb{E}(Y_{ij})$, $Y_{ij}$
is a response variable (e.g., amplitude), $\mathbf{X_i^*}$ is a row in the
matrix for a strictly parametric model component, $\theta$ is the corresponding
parameter vector, the $f(x)$'s are smooth functions of the covariates,
$\epsilon_{ij}$ are $N(0,\sigma^2)$ measurement errors, and
$\mathbf{Z_{ij}\alpha_{i}}$ are $N(0,D(\sigma^2))$ random effects
\cite{Hastie1990, Faraway2005, Wood2006, Keele2008}. The basis we use here for
our smooth functions, $f(x)$, are the regression splines defined in Equation
(\ref{eq:rspline}).}
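A model with this additive structure can be specified in the \texttt{mgcv} package. The sketch below uses simulated data with hypothetical names (\texttt{subject}, \texttt{x}): the smooth $f(x)$ is a penalized regression spline, and the by-subject random intercepts play the role of the $\mathbf{Z_{ij}\alpha_{i}}$ term, here specified as a random-effect smooth.

```r
library(mgcv)

# Simulated repeated-measures data (hypothetical structure):
# 20 subjects x 20 trials, a nonlinear effect of x, and
# subject-specific intercepts
set.seed(42)
subject <- factor(rep(1:20, each = 20))
x <- runif(400)
y <- sin(2 * pi * x) +                  # smooth f(x)
     rnorm(20)[as.integer(subject)] +   # random intercepts, alpha_i
     rnorm(400, sd = 0.3)               # measurement error
dat <- data.frame(y, x, subject)

# s(x) is the penalized regression spline;
# s(subject, bs = "re") implements the random intercepts
m <- gam(y ~ s(x, bs = "cr") + s(subject, bs = "re"), data = dat)
summary(m)$s.table  # edf and p-values for both terms
```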
Like ANOVA, \textcolor{red}{GAMM} also fits within the generalized linear mixed
effects model (GLMM). ANOVA and \textcolor{red}{GAMM} differ in that (1)
\textcolor{red}{GAMM} can model more complex random-effects structures
\cite[]{Baayen2008, Quene2008}; (2)
\textcolor{red}{GAMM} is robust to violations of sphericity if the correct
random effect structure is used \cite{Bagiella2000, Baayen2008}, eliminating
the need to correct for this post hoc using methods that are known to be either
overly conservative (Greenhouse-Geisser) or liberal (Huynh-Feldt), (3)
\textcolor{red}{GAMM} enables one to appropriately model imbalanced data
\cite{Bagiella2000, Gellman2007, Baayen2008}, a common situation in ERP data,
\textcolor{red}{and (4) the model fitting objective is augmented by a
``wiggliness'' penalty, enabling one to model non-linear relationships that
neither over- nor under-fit the data, in a manner that reduces subjectivity and
circularity \cite{Wood2006, Kriegeskorte2009}.}
To fully make use of these capabilities, data should not be averaged across
trials or any other variable. Rather, \textcolor{red}{a GAMM} should be
fitted on the un-averaged (``single trial'') data. Beyond allowing robust,
accurate estimates of variance, this practice has the added benefit of allowing
one to model nonlinear relationships between dependent and independent
variables in multi-dimensional space.
{{\subsubsection{\textcolor{red}{Striking a Balance Between Model Fit and Model
Smoothness}}
\textcolor{red}{By specifying the model in terms of smooth functions rather
than detailed parametric relationships it is possible to avoid otherwise
cumbersome and unwieldy models. The ``wiggliness'' of a smooth function (i.e.,
the trade-off between over- and under-fitting the data) is controlled by the
integrated square of the second derivative of the smooth,
$\int_0^1[f''(x)]^2dx$, multiplied by $\lambda$, a smoothing parameter
\cite{Wood2006}. As $\lambda$ tends towards infinity, the estimate of $f(x)$
will become a straight line, whereas $\lambda = 0$ will result in an
un-penalized (wiggly) estimate. One way of (objectively) determining $\lambda$
is through generalized cross validation (GCV). As \citeA{Hastie1990} put it,
``it is a way to \textit{let the data show us the appropriate functional
form}'' of a smooth (p. 1; emphasis is theirs). The GCV objective is to
minimize the error between the model-predicted value of a missing data point
and its real value. In brief, the flexibility of a GAMM is determined by
$\lambda$ and GCV rather than by the number and location of spline knots
\cite{Wood2006}.}
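The role of $\lambda$ can be seen by fixing the smoothing parameter by hand via the \texttt{sp} argument of \texttt{mgcv}'s \texttt{gam}, rather than letting GCV choose it. The data below are simulated, and the two fixed values of \texttt{sp} are arbitrary illustration points at the extremes:

```r
library(mgcv)

set.seed(3)
x <- runif(300)
y <- sin(2 * pi * x) + rnorm(300, sd = 0.4)

m.wiggly <- gam(y ~ s(x, k = 20), sp = 0)    # lambda = 0: unpenalized, wiggly fit
m.linear <- gam(y ~ s(x, k = 20), sp = 1e6)  # huge lambda: near-straight line
m.gcv    <- gam(y ~ s(x, k = 20))            # lambda chosen by GCV

# Effective degrees of freedom shrink as lambda grows
round(c(wiggly = sum(m.wiggly$edf),
        gcv    = sum(m.gcv$edf),
        linear = sum(m.linear$edf)), 1)
```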
}}
{{\subsubsection{\textcolor{red}{Number of Knots and Knot Locations}}
\textcolor{red}{By default, knots are located at a covariate's quantiles. Given
that model smoothness is determined by $\lambda$ and GCV, the exact number of
knots to use is not generally critical. Nevertheless, it should be large enough
that one is reasonably sure of having enough degrees of freedom to represent
the underlying ``truth'' reasonably well, but small enough to maintain
reasonable computational efficiency \cite<>[also see the help page for function
{\it choose.k} in R]{Wood2006}.}
}}
{{\subsubsection{\textcolor{red}{Probability Values for Smooth Terms}}
\textcolor{red}{To calculate a smooth's probability value, a Wald statistic, $T$, is first computed:}
\begin{equation}
\color{red}
T = f'V^*_f f
\end{equation}
\noindent \textcolor{red}{where $f$ is the vector of values of a smooth term,
$V^*_f$ is the rank-$r$ pseudo-inverse of the corresponding Bayesian
covariance matrix $V_f$, and $r$ is the estimated degrees of freedom ($edf$)
for the term \cite{Wood2013b}. $T$ is then compared to a $\chi^2$ distribution
with degrees of freedom equal to $edf$ for the term plus $\alpha$ (e.g., 0.05)
times the number of estimated smoothing parameters. }
}}
{{\subsubsection{\textcolor{red}{Confidence Intervals}}
\textcolor{red}{Smooths have Bayesian confidence intervals around them, which
are obtained by taking the quantiles from the posterior distribution of the
$f(x)$'s \cite{Marra2012}.}
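These intervals are what \texttt{mgcv} returns through \texttt{predict(..., se.fit = TRUE)} and what \texttt{plot.gam} draws around each smooth. A sketch on simulated data (hypothetical names):

```r
library(mgcv)

set.seed(4)
x <- runif(250)
y <- sin(2 * pi * x) + rnorm(250, sd = 0.4)
m <- gam(y ~ s(x, bs = "cr"))

# Pointwise 95% Bayesian confidence interval around the fitted curve
newd  <- data.frame(x = seq(0, 1, length = 100))
pr    <- predict(m, newd, se.fit = TRUE)
upper <- pr$fit + 1.96 * pr$se.fit
lower <- pr$fit - 1.96 * pr$se.fit
all(upper > lower)  # TRUE

# The same intervals are drawn by plot(m, seWithMean = TRUE)
```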
}}
}}
{{\subsection{Goals of the Present Study}
The goal of this paper is to demonstrate how relaxing the assumption of
linearity can lead to better modelling of ERP data. We also demonstrate how to
model nonlinear relationships using \textcolor{red}{GAMM}. This paper has been
written in \LaTeX{} using \texttt{Sweave} and \texttt{R}, and the source
document is provided as supplementary material. The data, a \texttt{pdf} of
this paper, the \texttt{.Rnw} file used to write it, and the \texttt{R} code
used to generate all of the analyses, tables, and figures presented in this
paper are available in package \texttt{nlEEG}. The package is available
as supplementary material or from
\texttt{http://hdl.handle.net/10222/22146}.\footnote{\textcolor{red}{The {\tt
R} code chunks contained in this paper are also available as individual {\tt
.R} files in folder {\it nlEEG/vignettes/Rcode\_chunks}.}} \textcolor{red}{The
paper also has a second set of supplementary material, entitled {\it What's
Under the Hood: A Brief Introduction to GAMM}, available from
\texttt{http://hdl.handle.net/10222/22146}.} These resources can be used to
work through the examples, and potentially act as a starting point for the
reader's own forays into \textcolor{red}{GAMM} analysis. This paper can,
however, be read and understood without viewing the accompanying \texttt{R}
code.
}}
}}
{{\section{Example ERP Data}
{{% intro to section
The ERP data used here were collected in the context of an immediate free recall
experiment described in \citeA{Tremblay2009} and \citeA{TremblayBaayen2010}.
The goal of this experiment was to investigate whether frequently used
four-word sequences, such as \textit{end of the year}, \textit{I don't really
know}, and \textit{at the same time}, may be (de)composed via the application
of compositional rules or stored and retrieved as wholes (in which case they
should show effects of the frequency of co-occurrence of the four-word
sequence). This was achieved by examining whether the frequency of such phrases
affected early ERP components such as the anterior N1 (N1a) and the posterior
P1. Both of these components peak approximately 110--150 ms after stimulus
presentation, and were previously found to be modulated in studies of single
words by linguistic variables including frequency and length
\cite{sereno:rayner:posner:1998, Assadollahi2003, HaukPulvermuller2004,
Hauk2006a, Murphy2006, Penolazzi2007}. \citeA{Tremblay2009} reported that
higher-frequency four-word sequences indeed elicited more negative N1 and less
positive P1 amplitudes than low-frequency sequences.
However, additional data were collected in this study that were not analyzed in
\citeA{Tremblay2009} or \citeA{TremblayBaayen2010}. In the present paper, we
investigate whether working memory capacity, the length of the second word of a
sequence, the position in a list where a four-word sequence was presented
(i.e., whether it was presented first, second, \ldots, fifth, or sixth in a
block), and their interactions affected ERPs in the memory task.\footnote{The
second word of a sequence appeared, more often than not, right where the
fixation cross was presented (centre of the screen). Thus, the second word of a
sequence was the first word that participants saw. It stands to reason that the
longer the second word of a sequence, the greater the amplitude of the N1
should be given that longer words elicit larger anterior negativities
\cite{Murphy2006}.}
In the analyses presented here we focused on the N1a as the ERP component of
interest. The N1a is known to be sensitive to spatial attention as well as
lexical frequency, probability of occurrence, and word length \cite[and
references cited therein]{Luck2005, Murphy2006, Penolazzi2007}. For instance,
high-frequency words elicit greater N1a amplitudes than low-frequency words,
and longer words elicit larger N1a amplitudes than shorter ones. Because in
this paper our focus is on demonstrating nonlinear analysis and comparing it to
a linear approach, we limited our analyses to a single electrode (Fz, located
along the midline, midway between the nose and the vertex of the head) where
the N1a component was maximal, and we do not discuss the implications of our
results with respect to the cognitive or linguistic literature.
\textcolor{red}{Nevertheless, we do provide the R code to perform a GAMM
analysis of the whole scalp, as well as results, in section {\it GAMM Analysis
of Entire Scalp in 80--180 ms Window} in supplementary material {\it What's
Under the Hood: A Brief Introduction to GAMM}.}
}}
{{\subsection{Participants}
Ten right-handed female students from the University of Alberta were paid for
their participation in the experiment (mean age = 23.4; \emph{SD} = 1.6;
min/max = 22/27). All were healthy native speakers of English, had normal or
corrected-to-normal vision, and did not report any neurological deficits.
Participants had a mean handedness score \cite{oldfield:1971} of 79.5/100
(\textit{SD} = 15.8). We assessed participants' reading span and working memory
capacity (henceforth WMC) using an adaptation of the \citeA{Daneman1980} test
presented on a PC using E-Prime (Mean WMC score = 73/100; \textit{SD} = 10.4).
}}
{{\subsection{Study design}
The stimulus list consisted of 432 four-word sequences taken from the
\emph{British National Corpus} \cite{davies:2004}. Some examples are
\textit{end of the year}, \textit{I don't really know}, \textit{at the same
time}, \textit{I have to say}, \textit{it would be a}, \textit{at the age of},
\textit{this is not a}, \textit{we've got to get}, and \textit{I think it's
the}. Frequencies, which were obtained from the \emph{Variations in English
Words and Phrases} search engine \cite{fletcher:2008}, ranged from 0.03 to 105
occurrences per million. The stimuli were divided into 72 blocks. Each block
was divided into 18 trials, where in each trial six four-word sequences were
randomly presented one at a time in the middle of the screen with an
inter-stimulus interval of roughly 4000 ms. Sequences subtended on a\-ve\-rage
$\sim5^{\circ} \times 0.4^{\circ}$ of visual angle; the longest four-word string
(\textit{becoming increasingly clear that}) subtended $\sim8^{\circ} \times
0.4^{\circ}$ of visual angle. At the end of each trial (i.e., after having seen
six four-word sequences), participants were prompted to type in as many
sequences as they could recall.
}}
{{\subsection{EEG Recordings and Processing}
Electroencephalogram (EEG) recordings were made with active Ag/AgCl electrodes
from 32 locations according to the international 10/20 system
(\texttt{www.biosemi.com/headcap.htm}) at the midline (Fz, Cz, Pz, Oz) and left
and right hemisphere (Fp1, Fp2, AF3, AF4, F3, F4, F7, F8, FC1, FC2, FC5, FC6,
C3, C4, T7, T8, CP1, CP2, CP5, CP6, P3, P4, P7, P8, PO3, PO4, O1, O2), as well
as the right and left mastoids, and referenced online to the common mode sense
(CMS) active electrode. Electrodes were mounted on a nylon cap. Eye movements were
monitored by electrodes placed above and below the left eye and at the outer
canthi of both eyes, which were bipolarized off-line to yield vertical and
horizontal electro-oculograms (EOG). Analog signals were sampled at 8,192 Hz
using a BioSemi (Amsterdam, The Netherlands) Active II 24-bit digital
amplification system with an input range of $\pm262$ mV and were
band-pass filtered between 0.01 and 100 Hz. Note that Biosemi uses active
electrodes and therefore can tolerate high scalp
impedances.\footnotemark\footnotetext{\textcolor{red}{As mentioned in}
\texttt{http://www.biosemi.nl/forum/viewtopic.php?t=486\&highlight=impedance},
\textcolor{red}{for active electrodes ``EEG currents do not (significantly)
flow via the electrodes because the input impedance of all current biopotential
amplifiers is very high. So, typical electrode impedances (smaller than a few
hundred kOhm) do not influence measured EEG voltages''. Our impedances were
around 25 Ohms.}}
The digitized EEG was initially processed off-line using \texttt{Analyzer}
version 1.05 (Brain Products GmbH, Gilching, Germany): It was re-referenced to
the a\-ve\-rage of the right and left mastoids, downsampled to 128 Hz
\textcolor{red}{(using an {\tt Analyzer} script)}, band-pass filtered from 0.01
to 30 Hz using a forward-backward combination of two-pole infinite impulse
response Butterworth filters (yielding zero phase shift overall), and corrected
for eye movements and eye blinks by regressing out the vertical and horizontal
EOGs \cite{Gratton:Coles:Donchin:1983}. The processed
signal was then segmented into epochs of 3000 ms (1500 ms before and after
stimulus onset). Each epoch was baseline corrected on the 1500 ms segment
immediately preceding stimulus onset.
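For readers who wish to reproduce a similar step outside {\tt Analyzer}, a zero-phase band-pass of the kind described above can be sketched with the {\tt signal} package in {\tt R}. This is illustrative only: the toy data, filter order, and package choice are assumptions, not the pipeline actually used in this study.

```r
## Illustrative zero-phase 0.01--30 Hz band-pass, similar in spirit to
## the Analyzer processing described in the text (not the actual pipeline).
library(signal)  # butter(), filtfilt()

fs <- 128                    # sampling rate after downsampling, in Hz
ny <- fs / 2                 # Nyquist frequency
bf <- butter(2, c(0.01, 30) / ny, type = "pass")  # two-pole Butterworth

set.seed(2)
eeg <- rnorm(3 * fs)         # 3 s of fake single-channel EEG
## filtfilt() runs the filter forward then backward, cancelling the
## phase shift -- the "forward-backward" combination described above
eeg.filt <- filtfilt(bf, eeg)
```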
An {\tt mp4} movie of the scalp topographies through time can be downloaded
from \texttt{http://hdl.handle.net/10222/22146}. Figure \ref{fig:topomap141ms}
\textcolor{red}{is an animation of the scalp topography through time. The very
first frame of the animation} depicts the scalp topography at time $t = 141$ ms
where the N1a was maximal. It is apparent from the scalp animation, as well as
from the bottom panel of Figure \ref{fig:topomap141ms}, that a negative
deflection around electrode Fz --- the N1a --- began at $t \thicksim 78$ ms,
peaked at time $t = 141$ ms ($\thicksim -5 \mu V$), and returned to baseline at
time $t \thicksim 180$ ms. Our window for analysis was thus chosen as 80--180
ms.
{{% R code: set some options
{\footnotesize
% eval=FALSE
<<>>=
options(width=60)
options(continue=" ")
@
}
}}
{{% R code: install and load libraries
{\footnotesize
% eval=FALSE
<<>>=
### IF YOU ARE MAKING THIS DOCUMENT FOR THE FIRST TIME,
### SET ALL THE EVAL'S TO TRUE. IT WILL TAKE QUITE A BIT
### OF TIME TO MAKE. THEN RESET THE EVAL'S MARKED AS FALSE
### BACK TO FALSE TO SAVE TIME WHEN RE-MAKING.
if(!try(require(LMERConvenienceFunctions, quietly = TRUE))){
install.packages("LMERConvenienceFunctions", dependencies = TRUE,
repos = "http://mirror.its.dal.ca/cran/")
}
library(LMERConvenienceFunctions)
#
# Only available from dalspace at the moment.
if(!try(require(nlEEG, quietly = TRUE))){
tmp.dir <- tempdir()
download.file(url = "http://dalspace.library.dal.ca/bitstream/handle/10222/22146/nlEEG_1.0.tar.gz?sequence=1", destfile = file.path(tmp.dir,
"nlEEG_1.0.tar.gz"), method = "auto")
install.packages(file.path(tmp.dir, "nlEEG_1.0.tar.gz"),
repos = NULL)
file.remove(file.path(tmp.dir, "nlEEG_1.0.tar.gz"))
}
library(nlEEG)
#
if(!try(require(xtable, quietly = TRUE))){
install.packages("xtable", dependencies = TRUE,
repos = "http://mirror.its.dal.ca/cran/")
}
library(xtable)
#
if(!try(require(fields, quietly = TRUE))){
install.packages("fields", dependencies = TRUE,
repos = "http://mirror.its.dal.ca/cran/")
}
library(fields)
#
if(!try(require(icaOcularCorrection, quietly = TRUE))){
install.packages("icaOcularCorrection", dependencies = TRUE,
repos = "http://mirror.its.dal.ca/cran/")
}
library(icaOcularCorrection)
#
if(!try(require(gamair, quietly = TRUE))){
install.packages("gamair", dependencies = TRUE,
repos = "http://mirror.its.dal.ca/cran/")
}
library(gamair)
#
dir.create("data")
dir.create("models")
dir.create("tables")
dir.create("smry")
@
}
}}
{{% R code: format data into long format
{\footnotesize
% eval=FALSE
<<>>=
### will need about 6--8 GB of RAM to do this portion
# load data set
data(eegWide)
dat<-eegWide
rm(eegWide)
gc(TRUE,TRUE)
data(vars)
dat <- merge(dat[,c("Subject", "Item", "Stimulus", "Time", "Trial", "Fp1",
"Fp2", "AF3", "AF4", "F7", "F3", "Fz", "F4", "F8", "FC5", "FC1", "FC2",
"FC6", "T7", "C3", "Cz", "C4", "T8", "CP5", "CP1", "CP2", "CP6", "P7", "P3",
"Pz", "P4", "P8", "PO3", "PO4", "O1", "Oz", "O2")],vars[,c("Item", "Subject",
"WMC", "Position", "LengthB", "Recalled")], by = c("Item", "Subject"))
#
# put in long format
electrodes <- c("Fp1", "Fp2", "AF3", "AF4", "F7", "F3", "Fz", "F4", "F8",
"FC5", "FC1", "FC2", "FC6", "T7", "C3", "Cz", "C4", "T8", "CP5", "CP1",
"CP2", "CP6", "P7", "P3", "Pz", "P4", "P8", "PO3", "PO4", "O1", "Oz", "O2")
temp <- dat[, c("Subject", "Item", "Stimulus", "Time", "Trial",
"Position", "WMC", "LengthB", "Recalled")]
for(e in 1:length(electrodes)){
cat(e, "\n")
if(e == 1){
temp2 <- cbind(temp, dat[, electrodes[e]])
colnames(temp2)[ncol(temp2)] <- "Amplitude"
temp2$Electrode <- electrodes[e]
}else{
temp3 <- cbind(temp, dat[, electrodes[e]])
colnames(temp3)[ncol(temp3)] <- "Amplitude"
temp3$Electrode <- electrodes[e]
temp2 <- rbind(temp2, temp3)
rm(temp3); gc(TRUE, TRUE)
}
}
dat <- temp2
dat$Electrode <- as.factor(dat$Electrode)
dat <- dat[order(dat$Subject, dat$Electrode, dat$Position, dat$Time), ]
subj<-sort(unique(dat$Subject))
for(i in subj){
cat("processing subject", i, "\n")
tmp <- dat[dat$Subject == i, , drop=TRUE]
tmp$newfact <- paste(tmp$Block, tmp$Position, sep = "_")
newvec <- vector("numeric")
for(j in 1:length(unique(tmp$newfact))){
cat("\tnewfact",j,"\n")
newvec <- c(newvec, rep(j, nrow(tmp[tmp$newfact ==
unique(tmp$newfact)[j], ])))
}
tmp$Trial <- newvec
if(i == subj[1]){ # exact match; grep() could partially match subject IDs
newdat <- tmp
}else{
newdat <- rbind(newdat, tmp)
}
}
dat<-newdat
# per subject trimming
dat <- perSubjectTrim.fnc(dat, response = "Amplitude",
subject = "Subject", trim = 2.5)$data
# n.removed = 186565
# percent.removed = 1.800765
#
dat <- dat[ , c("Subject", "Item", "Time", "Trial", "Position", "WMC",
"LengthB", "Electrode", "Amplitude", "Recalled")]
# save trimmed data
save(dat, file = "data/eegLong.rda", compress = "xz")
@
}
}}
{{% R code: create topomaps of averaged data
{\footnotesize
% eval=FALSE
<<>>=
# produce plot
data(eegLong)
dat<-eegLong
rm(eegLong)
gc(TRUE,TRUE)
# create average
electrodes <- unique(dat$Electrode)
time <- sort(unique(dat$Time))
for(i in 1:length(electrodes)){
cat("processing electrode", electrodes[i], "\n")
tmp <- dat[dat$Electrode == electrodes[i], ]
x <- tapply(tmp$Amplitude, tmp$Time, mean)
if(i == 1){
avg <- data.frame(Time = time, Amplitude = x, Electrode = electrodes[i])
}else{
avg <- rbind(avg, data.frame(Time = time, Amplitude = x,
Electrode = electrodes[i]))
}
}
rownames(avg) <- 1:nrow(avg)
#
# define cartesian coordinate for Biosemi 32 electrodes
coords <- data.frame(x = c(-0.496189,-0.545830,-1.299041,-0.659023,-0.394923,
-1.173172,-1.605703,-0.802851,-0.394923,-1.173172,-1.299041,-0.659023,
0.000000,-0.545830,-0.496189,0.000000,0.496189,0.545830,0.659023,1.299041,
1.173172,0.394923,0.802851,1.605703,1.173172,0.394923,0.659023,1.299041,
0.545830,0.496189,0.000000,0.000000),y=c(1.527114,1.170536,0.943808,
0.813825,0.394923,0.450338,-0.000000,-0.000000,-0.394923,-0.450338,
-0.943808,-0.813825,-0.802851,-1.170536,-1.527114,-1.605703,-1.527114,
-1.170536,-0.813825,-0.943808,-0.450338,-0.394923,0.000000,0.000000,
0.450338,0.394923,0.813825,0.943808,1.170536,1.527114,0.802851,0.000000),
Channel = c("Fp1","AF3","F7", "F3","FC1","FC5", "T7","C3","CP1","CP5",
"P7","P3","Pz","PO3","O1","Oz","O2", "PO4","P4","P8", "CP6","CP2","C4",
"T8","FC6","FC2","F4","F8","AF4","Fp2", "Fz","Cz"))
#
# merge
avg <- merge(avg, coords, by.x = "Electrode", by.y = "Channel")
rownames(avg) <- 1:nrow(avg)
save(avg, file = "data/avg.rda", compress = "xz")
#
# model the topomaps
m <- gam(Amplitude ~ te(Time, x, y, bs = "cs", k = c(40, 5, 5)), data = avg)
# save model
save(m, file = "models/topomap_with_baseline_model.rda", compress = "xz")
#
###########################################################
### create frames for movie
load("models/topomap_with_baseline_model.rda")
load("data/avg.rda")
dat <- m$model
time <- sort(unique(dat$Time))
zlimit = c(-5.5, 5.5)
ylimit = c(-4.5, 4.5)
coords <- data.frame(x = c(-0.496189,-0.545830,-1.299041,-0.659023,-0.394923,
-1.173172,-1.605703,-0.802851,-0.394923,-1.173172,-1.299041,-0.659023,
0.000000,-0.545830,-0.496189,0.000000,0.496189,0.545830,0.659023,1.299041,
1.173172,0.394923,0.802851,1.605703,1.173172,0.394923,0.659023,1.299041,
0.545830,0.496189,0.000000,0.000000),y=c(1.527114,1.170536,0.943808,
0.813825,0.394923,0.450338,-0.000000,-0.000000,-0.394923,-0.450338,
-0.943808,-0.813825,-0.802851,-1.170536,-1.527114,-1.605703,-1.527114,
-1.170536,-0.813825,-0.943808,-0.450338,-0.394923,0.000000,0.000000,
0.450338,0.394923,0.813825,0.943808,1.170536,1.527114,0.802851,0.000000),
Channel = c("Fp1","AF3","F7", "F3","FC1","FC5", "T7","C3","CP1","CP5",
"P7","P3","Pz","PO3","O1","Oz","O2", "PO4","P4","P8", "CP6","CP2","C4",
"T8","FC6","FC2","F4","F8","AF4","Fp2", "Fz","Cz"))
electrodes <- as.character(coords$Channel)
np <- 400
#pal <- colorRampPalette(c("blue","white","red"))(75)
pal <- tim.colors(200)
x <- seq(min(coords$x)-1, max(coords$x)+1, length.out = np)
y <- seq(min(coords$y)-1, max(coords$y)+1, length.out = np)
library(fields)
library(icaOcularCorrection)
for(i in 1:length(time)){
cat("processing Time",i,"\n")
png(filename = paste("figs/topomaps/Time",i,".png",sep=""),
height=960,width=960)
newdat <- expand.grid(x, y, Time = time[i])
colnames(newdat) <- c("x","y","Time")
X <- predict(m, newdat, type = "lpmatrix")
X <- X %*% coef(m)
# create average matrix
newdat <- data.frame(newdat[, c("x", "y")], Amplitude = X)
mat<-tapply(newdat$Amplitude, list(newdat$x, newdat$y), mean)
xm<-as.numeric(rownames(mat))
ym<-as.numeric(colnames(mat))
#
layout(matrix(c(1,1,1,1,1,1,2,2), 4, 2, byrow = TRUE))
par(mar = c(0.1, 5.1, 3.1, 17.1))
image(xm,ym,mat,axes=FALSE,xlab="",ylab="",col=pal,zlim=zlimit)
contour(xm,ym,mat,add=T,nlevels=25,labcex=0.01,col="black")
draw.head(which.head="biosemi.32",x=xm,y=ym)
text(coords$x,coords$y,labels=coords$Channel,cex=1.75)
image.plot(xm,ym,mat,axes=FALSE,xlab="",ylab="",col=pal,
zlim=zlimit,legend.args=list(eval(parse(text=paste("expression(",
paste("Amplitude~~(",expression(mu * V),")",sep=""),")",sep=""))),
cex=2.25,line=3.5,side=1,adj=0.9),legend.width=2,legend.shrink=0.5,
axis.args=list(cex.axis=1.75),legend.only=TRUE,graphics.reset=TRUE,
legend.mar=8.1)
close.screen(all.screens=TRUE)
par(new=TRUE)
layout(matrix(c(1,1,1,1,1,1,2,2), 4, 2, byrow = TRUE))
frame()
par(mar = c(3.1, 4.1, 0.1, 2.1))
#
# now plot subplot
for(e in electrodes){
tmp <- avg[avg$Electrode==e,]
m.tmp <- gam(Amplitude ~ s(Time, bs = "cs", k = 40),
data = tmp)
pi.tmp <- plotGAM(m.tmp, plot = FALSE)
if(which(electrodes == e) == 1){
plot(pi.tmp$x, pi.tmp$y, col = which(electrodes == e),
ylim = ylimit, type = "l", ylab = "", cex.axis = 2.25,
lwd = 2)
}else{
lines(pi.tmp$x, pi.tmp$y, col = which(electrodes == e),
lwd = 2)
}
}
abline(v=0, lty=3, lwd = 2)
abline(v = time[i], lwd = 4)
mtext(paste("Time =", round(time[i],0),"ms"),side=3,line=72,adj=0.5,cex=2)
dev.off()
}
# In a bash terminal (Linux), copy and paste the following command
# to produce an mp4 movie.
#
# avconv -y -r 15 -i Time%01d.png -c:v libx264 -preset slow -crf 21 ../Topomaps.mp4
#
# -y = overwrite the output file if it exists
# -r 15 = 15 frames per second
# -i Time%01d.png = input files (Time1.png up to the highest number
#     in the file names)
# -c:v libx264 = use libx264 for encoding
# -preset slow = to get a better quality movie
# -crf 21 = the quality of the movie, from 0 (the best) to 51 (the worst)
#
#
# you can also do it directly in R with the following commands:
#
setwd("figs/topomaps")
system("avconv -y -r 10 -i Time%01d.png -c:v libx264 -preset slow -crf 0 ../Topomaps.mp4")
setwd("../..")
#
#
### create PNG for time = 141 ms, i.e., figure 18.
i<-32
np <- 400
pal <- tim.colors(200)
x <- seq(min(coords$x)-1, max(coords$x)+1, length.out = np)
y <- seq(min(coords$y)-1, max(coords$y)+1, length.out = np)
time <- sort(unique(dat$Time))
zlimit = c(-5.5, 5.5)
ylimit = c(-4.5, 4.5)
library(fields)
library(icaOcularCorrection)
#
newdat <- expand.grid(x, y, Time = time[i])
colnames(newdat) <- c("x","y","Time")
X <- predict(m, newdat, type = "lpmatrix",newdata.guaranteed=TRUE)
X <- X %*% coef(m)
# create average matrix
newdat <- data.frame(newdat[, c("x", "y")], Amplitude = X)
mat<-tapply(newdat$Amplitude, list(newdat$x, newdat$y), mean)
xm<-as.numeric(rownames(mat))
ym<-as.numeric(colnames(mat))
save(mat,file="data/plotmat141ms.rda")
#
#
load("data/plotmat141ms.rda")
xm<-as.numeric(rownames(mat))
ym<-as.numeric(colnames(mat))
#
png(filename = "figs/Time141ms.png", height=960, width=960)
layout(matrix(c(1,1,1,1,1,1,2,2), 4, 2, byrow = TRUE))
par(mar = c(0.1, 5.1, 3.1, 17.1))
image(xm,ym,mat,axes=FALSE,xlab="",ylab="",col=pal,zlim=zlimit)
contour(xm,ym,mat,add=T,nlevels=25,labcex=0.01,col="black")
draw.head(which.head="biosemi.32",x=xm,y=ym)
text(coords$x,coords$y,labels=coords$Channel,cex=1.75)
text(coords[coords$Channel=="Fz",]$x,coords[coords$Channel=="Fz",]$y,
labels=coords[coords$Channel=="Fz",]$Channel,cex=2.75,col="white")
image.plot(xm,ym,mat,axes=FALSE,xlab="",ylab="",col=pal,zlim=zlimit,
legend.args=list(eval(parse(text=paste("expression(",
paste("Amplitude~~(",expression(mu * V),")",sep=""),")",sep=""))),
cex=2.25,line=3.5,side=1,adj=0.9),legend.width=2,legend.shrink=0.5,
axis.args=list(cex.axis=1.75),legend.only=TRUE,graphics.reset=TRUE,
legend.mar=8.1)
close.screen(all.screens=TRUE)
par(new=TRUE)
layout(matrix(c(1,1,1,1,1,1,2,2), 4, 2, byrow = TRUE))
frame()
par(mar = c(3.1, 4.1, 0.1, 2.1))
#
# now plot subplot
for(e in electrodes){
tmp <- avg[avg$Electrode==e,]
m.tmp <- gam(Amplitude ~ s(Time, bs = "cs", k = 40),
data = tmp)
pi.tmp <- plotGAM(m.tmp, plot = FALSE)
if(which(electrodes == e) == 1){
plot(pi.tmp$x, pi.tmp$y, col = which(electrodes == e),
ylim = ylimit, type = "l", ylab = "", cex.axis = 2.25,
lwd = 2)
}else{
lines(pi.tmp$x, pi.tmp$y, col = which(electrodes == e),
lwd = 2)
}
}
abline(v=0, lty=3, lwd = 2)
abline(v = time[i], lwd = 4)
dev.off()
@
}
}}
{{% Animation: topomap
\begin{figure}[tbh]
\centering
\animategraphics[controls,height=0.5\textheight,width=0.7\textwidth]{3}{figs/topomaps/Time}{0}{90}
\caption{Scalp topography at time $t = 141$ ms where the N1a was maximal. Blue
colors indicate negative amplitudes while red colors indicate positive
amplitudes. White indicates an amplitude of 0 $\mu V$. The bottom panel shows
the a\-ve\-rage waveform of each of the 32 electrodes overlaid on top of each
other (each electrode is graphed with a different color). The {\it x}-axis
represents time in milliseconds and the {\it y}-axis is amplitude in $\mu V$. A
vertical black bar marks the peak of the N1a component. \textcolor{red}{An
animation depicting the evolution of the scalp topography from -100 to 600 ms
can be viewed with {\tt Acrobat Reader}. To play the animation, click on one of
the ``triangle buttons'' right below the figure. The triangle pointing toward
the left will play the animation backward. The one pointing to the right will
play it going forward. The animation can be manipulated by clicking on the
``<'' and ``>'' buttons to move the animation one frame at a time backward or
forward, respectively. The ``|<'' and ``>|'' buttons make the movie start from
the beginning or go to the end, respectively. The ``->|<-'' button to the
right restores the animation to its default speed. The ``-'' and ``+''
buttons to its left and right make the animation go slower or faster,
respectively.}}
\label{fig:topomap141ms}
\end{figure}
%\begin{figure}[tbh]
%\centering
%\includegraphics{figs/Time141ms.png}
%\caption{Scalp topography at time $t = 141$ ms where the N1a was maximal. Blue colors indicate negative amplitudes while red colors indicate positive amplitudes. White indicates an amplitude of 0 $\mu V$. The bottom panel shows the a\-ve\-rage waveform of each of the 32 electrodes overlaid on top of each other (each electrodes is graphed with a different color). The {\it x}-axis represents time in milliseconds and the {\it y}-axis is amplitude in $\mu V$. A vertical black bar marks the peak of the N1a component.}
%\label{fig:topomap141ms}
%\end{figure}
}}
}}
}}
{{\section{Results}
{{% intro to section
The three predictor variables (fixed effects) under investigation were (1) the
position of the four-word sequence in the set of six items presented in a block
($Position$), (2) the length of the second word in the four-word chunk
($Length$)\footnote{The second word was chosen because it most often overlapped
the center of the monitor, where subjects fixated between trials.}, and (3) the
working memory capacity of the participant ($WMC$). These predictor variables
were transformed prior to plotting or analysis. We first mean-centered
$Position$, $Length$, and $WMC$ by subtracting the mean of a variable from each
of its values. The main purpose of this centering was to increase numerical
stability and remove any spuriously high correlation that may arise between
random intercepts and random slopes as well as between fixed and random
variables \cite{Hofman1998, Kreft1998, Baayen:2008}. We first present the
results from the behavioural analysis and then move on to the ERP analysis.
}}
{{\subsection{Behavioural Results}
We investigated whether $Length$, $Position$, $WMC$, and their interactions
affected the probability of correctly recalling a four-word sequence. A
sequence was scored as correctly recalled only if it was reproduced exactly.
That is, if the target sequence was {\it in the middle of}, any response other than
{\it in the middle of} was considered to be incorrect (e.g., {\it in the
middle}, {\it in the middle and}, {\it in the middle of a}, or {\it at the
middle of}). However, we accepted minor misspellings such as {\it in the mdle
of} or {\it n the midle of}.
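A rule of this kind can be sketched in {\tt R} using the base function {\tt adist} (generalized Levenshtein edit distance). This is purely illustrative: the function name and the edit-distance threshold of 2 are assumptions, and the actual responses in the study were scored by hand.

```r
## Hypothetical sketch of the scoring rule described above: a response is
## "correct" only if it matches the target exactly or differs by a small
## number of edits (the threshold of 2 is an assumption for illustration).
score.recall <- function(target, response, max.dist = 2) {
  if (adist(response, target) <= max.dist) "correct" else "incorrect"
}

score.recall("in the middle of", "in the mdle of")  # minor misspelling: "correct"
score.recall("in the middle of", "in the middle")   # truncated: "incorrect"
```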
{{% R code: behavioural analysis
% eval=FALSE
{\footnotesize
<<>>=
library(nlEEG)
### load data and mean centre things
data(vars)
vars$LengthBc<-vars$LengthB-mean(vars$LengthB)
vars$WMCc<-vars$WMC-mean(vars$WMC)
vars$Positionc<-vars$Position-mean(vars$Position)
vars$Recalled<-relevel(vars$Recalled,"incorrect")
### look at raw probabilities
# calculate probability for Position
x1<-table(vars$Recalled,vars$Position)
x1<-x1[2,]/(x1[2,]+x1[1,])
# calculate probability for LengthB
x2<-table(vars$Recalled,vars$LengthB)
x2<-x2[2,]/(x2[2,]+x2[1,])
# calculate probability for WMC
x3<-table(vars$Recalled,vars$WMC)
x3<-x3[2,]/(x3[2,]+x3[1,])
plot(density(vars$Position))
plot(density(vars$LengthB))
plot(density(vars$WMC))
plotDensity3d.fnc(x=sort(vars$Position),y=sort(vars$LengthBc))
plotDensity3d.fnc(x=sort(vars$Position),y=sort(vars$WMCc))
plotDensity3d.fnc(x=sort(vars$LengthBc),y=sort(vars$WMCc))
#
ylimit<-c(0,0.6)
png(filename="figs/raw_prob_behavioural.png")
par(mfrow=c(2,2))
plot(x1,type="l",xlab="Position",ylim=ylimit,ylab="")
plot(as.numeric(names(x2)),x2,type="l",xlab="LengthB",
ylim=ylimit,ylab="prob. of correctly recalling a sequence")
plot(as.numeric(names(x3)),x3,type="l",xlab="WMC",
ylim=ylimit,ylab="",xaxt="n")
wmc2<-sort(unique(vars$WMC))
wmc2<-round(seq(min(wmc2),max(wmc2),length.out=6),2)
axis(side=1,at=wmc2,labels=wmc2,cex.axis=0.9)
par(mfrow=c(1,1))
dev.off()
#
#
dim(vars)
# [1] 4260 24
### fit model, takes 2 minutes
m1<-gam(Recalled~s(Position,k=6)+s(LengthBc)+s(WMCc)+
ti(Position,LengthBc)+ti(Position,WMCc)+
ti(LengthBc,WMCc)+ti(Position,LengthBc,WMCc)+
s(Subject,bs="re"),family="binomial",data=vars)
save(m1,file="models/m1_behav.rda")
summary(m1)
# Family: binomial
# Link function: logit
#
# Formula:
# Recalled ~ s(Position, k = 6) + s(LengthBc) + s(WMCc) + ti(Position,
# LengthBc) + ti(Position, WMCc) + ti(LengthBc, WMCc) + ti(Position,
# LengthBc, WMCc) + s(Subject, bs = "re")
#
# Parametric coefficients:
# Estimate Std. Error z value Pr(>|z|)
# (Intercept) -0.99435 0.09844 -10.1 <2e-16 ***
# ---
# Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
#
# Approximate significance of smooth terms:
# edf Ref.df Chi.sq p-value
# s(Position) 4.435 4.848 375.680 < 2e-16 ***
# s(LengthBc) 8.091 8.614 36.298 2.85e-05 ***
# s(WMCc) 1.003 1.003 8.076 0.004521 **
# ti(Position,LengthBc) 5.697 7.366 12.455 0.101833
# ti(Position,WMCc) 8.813 10.702 31.698 0.000736 ***
# ti(LengthBc,WMCc) 3.704 3.916 7.609 0.101733
# ti(Position,LengthBc,WMCc) 16.872 23.672 22.827 0.510731
# s(Subject) 6.864 8.000 41.259 8.28e-08 ***
# ---
# Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
#
# R-sq.(adj) = 0.135 Deviance explained = 12.3%
# UBRE score = 0.092036 Scale est. = 1 n = 4260
### takes 3-5 minutes
m2<-update(m1,.~.+s(Item,bs="re"))
save(m2,file="models/m2_behav.rda")
(smry<-summary(m2))
save(smry,file="smry/smry_m2_behav.rda")
# Family: binomial
# Link function: logit
#
# Formula:
# Recalled ~ s(Position, k = 6) + s(LengthBc) + s(WMCc) + ti(Position,
# LengthBc) + ti(Position, WMCc) + ti(LengthBc, WMCc) + ti(Position,
# LengthBc, WMCc) + s(Subject, bs = "re") + s(Item, bs = "re")
#
# Parametric coefficients:
# Estimate Std. Error z value Pr(>|z|)
# (Intercept) -1.0246 0.1449 -7.07 1.55e-12 ***
# ---
# Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
#
# Approximate significance of smooth terms:
# edf Ref.df Chi.sq p-value
# s(Position) 4.561 4.898 366.989 < 2e-16 ***
# s(LengthBc) 7.793 8.266 23.270 0.003677 **
# s(WMCc) 1.000 1.000 4.001 0.045473 *
# ti(Position,LengthBc) 5.547 7.191 11.316 0.135617
# ti(Position,WMCc) 8.384 10.103 30.789 0.000688 ***
# ti(LengthBc,WMCc) 1.000 1.001 2.533 0.111577
# ti(Position,LengthBc,WMCc) 9.458 12.590 14.272 0.324518
# s(Subject) 7.440 8.000 46.143 5.14e-08 ***
# s(Item) 166.667 424.000 265.630 1.42e-13 ***
# ---
# Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
#
# R-sq.(adj) = 0.194 Deviance explained = 20.1%
# UBRE score = 0.071453 Scale est. = 1 n = 4260
### Do we want to include by-item random intercepts?
AIC(m1,m2);AIC(m1)-AIC(m2)
# df AIC
# m1 56.4785 4652.074
# m2 212.8499 4564.389
#[1] 87.6857
relLik(m1,m2)
# AIC(x) AIC(y) diff relLik
# 4.652074e+03 4.564389e+03 8.768570e+01 1.098265e+19
m1$gcv.ubre;m2$gcv.ubre
# [1] 0.09203625
# [1] 0.07145276
### YES, WE WANT TO INCLUDE BY-ITEM RANDOM INTERCEPTS
### IS THE 3-WAY INTERACTION "SIGNIFICANT"?
### refit without 3-way interaction
# takes about 5 minutes to run
m3<-update(m2,.~.-ti(Position,LengthBc,WMCc))
save(m3,file="models/m3_behav.rda")
summary(m3)
# Family: binomial
# Link function: logit
#
# Formula:
# Recalled ~ s(Position, k = 6) + s(LengthBc) + s(WMCc) + ti(Position,
# LengthBc) + ti(Position, WMCc) + ti(LengthBc, WMCc) + s(Subject,
# bs = "re") + s(Item, bs = "re")
#
# Parametric coefficients:
# Estimate Std. Error z value Pr(>|z|)
# (Intercept) -1.0207 0.1442 -7.078 1.46e-12 ***
# ---
# Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
#
# Approximate significance of smooth terms:
# edf Ref.df Chi.sq p-value
# s(Position) 4.576 4.904 369.788 < 2e-16 ***
# s(LengthBc) 7.846 8.305 23.315 0.003716 **
# s(WMCc) 1.000 1.000 4.340 0.037241 *
# ti(Position,LengthBc) 5.910 7.564 10.831 0.180966
# ti(Position,WMCc) 8.539 10.278 32.214 0.000458 ***
# ti(LengthBc,WMCc) 3.862 3.981 8.113 0.086529 .
# s(Subject) 7.437 8.000 46.130 5.17e-08 ***
# s(Item) 167.023 424.000 266.665 1.23e-13 ***
# ---
# Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
#
# R-sq.(adj) = 0.193 Deviance explained = 19.8%
# UBRE score = 0.071802 Scale est. = 1 n = 4260
AIC(m2,m3);AIC(m2)-AIC(m3)
# df AIC
# m2 212.8499 4564.389
# m3 207.1934 4565.875
# [1] -1.48672
relLik(m2,m3)
# AIC(x) AIC(y) diff relLik
# 4564.38874 4565.87546 -1.48672 2.10299
m2$gcv.ubre;m3$gcv.ubre
# [1] 0.07145276
# [1] 0.07180175
### NO, IT IS NOT "SIGNIFICANT"
### IS THE Position X LengthBc INTERACTION "SIGNIFICANT"?
### refit without Position X LengthBc interaction
m4<-update(m3,.~.-ti(Position,LengthBc))
save(m4,file="models/m4_behav.rda")
summary(m4)
# Family: binomial
# Link function: logit
#
# Formula:
# Recalled ~ s(Position, k = 6) + s(LengthBc) + s(WMCc) + ti(Position,
# WMCc) + ti(LengthBc, WMCc) + s(Subject, bs = "re") + s(Item,
# bs = "re")
#
# Parametric coefficients:
# Estimate Std. Error z value Pr(>|z|)
# (Intercept) -1.0027 0.1407 -7.125 1.04e-12 ***
# ---
# Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
#
# Approximate significance of smooth terms:
# edf Ref.df Chi.sq p-value
# s(Position) 4.590 4.910 369.255 < 2e-16 ***
# s(LengthBc) 2.616 3.046 13.139 0.004598 **
# s(WMCc) 1.000 1.000 4.098 0.042926 *
# ti(Position,WMCc) 8.318 10.029 30.771 0.000655 ***
# ti(LengthBc,WMCc) 1.001 1.002 1.014 0.314309
# s(Subject) 7.415 8.000 44.961 8.57e-08 ***
# s(Item) 175.945 424.000 287.466 3.81e-15 ***
# ---
# Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
#
# R-sq.(adj) = 0.189 Deviance explained = 19.4%
# UBRE score = 0.074411 Scale est. = 1 n = 4260
AIC(m3,m4);AIC(m3)-AIC(m4)
# df AIC
# m3 207.1934 4565.875
# m4 201.8843 4576.989
# [1] -11.11395
relLik(m3,m4)
# AIC(x) AIC(y) diff relLik
# 4565.87546 4576.98941 -11.11395 259.03826
m3$gcv.ubre;m4$gcv.ubre
# [1] 0.07180175
# [1] 0.07441066
### YES, KEEP IT
### DO WE WANT THE LengthB X WMC INTERACTION?
### refit without LengthBc X WMCc interaction
m5<-update(m3,.~.-ti(LengthBc,WMCc))
save(m5,file="models/m5_behav.rda")
summary(m5)
# Family: binomial
# Link function: logit
#
# Formula:
# Recalled ~ s(Position, k = 6) + s(LengthBc) + s(WMCc) + ti(Position,
# LengthBc) + ti(Position, WMCc) + s(Subject, bs = "re") +
# s(Item, bs = "re")
#
# Parametric coefficients:
# Estimate Std. Error z value Pr(>|z|)
# (Intercept) -1.0090 0.1399 -7.213 5.47e-13 ***
# ---
# Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
#
# Approximate significance of smooth terms:
# edf Ref.df Chi.sq p-value
# s(Position) 4.557 4.897 368.380 < 2e-16 ***
# s(LengthBc) 8.083 8.477 24.315 0.002867 **
# s(WMCc) 1.000 1.001 4.296 0.038238 *
# ti(Position,LengthBc) 5.861 7.538 10.311 0.208310
# ti(Position,WMCc) 8.436 10.163 31.442 0.000563 ***
# s(Subject) 7.404 8.000 45.207 7.21e-08 ***
# s(Item) 166.233 424.000 264.513 1.88e-13 ***
# ---
# Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
#
# R-sq.(adj) = 0.191 Deviance explained = 19.6%
# UBRE score = 0.072709 Scale est. = 1 n = 4260
AIC(m3,m5);AIC(m3)-AIC(m5)
# df AIC
# m3 207.1934 4565.875
# m5 202.5750 4569.742
# [1] -3.866989
relLik(m3,m5)
# AIC(x) AIC(y) diff relLik
# 4565.875458 4569.742447 -3.866989 6.913626
m3$gcv.ubre;m5$gcv.ubre
# [1] 0.07180175
# [1] 0.07270949
### NO, WE DO NOT WANT THE LengthB X WMC INTERACTION IN THE MODEL
### DO WE WANT TO INCLUDE THE Position X LengthB INTERACTION?
### refit without Position X LengthBc interaction
m6<-update(m5,.~.-ti(Position,LengthBc))
save(m6,file="models/m6_behav.rda")
summary(m6)
# Family: binomial
# Link function: logit
#
# Formula:
# Recalled ~ s(Position, k = 6) + s(LengthBc) + s(WMCc) + ti(Position,
# WMCc) + s(Subject, bs = "re") + s(Item, bs = "re")
#
# Parametric coefficients:
# Estimate Std. Error z value Pr(>|z|)
# (Intercept) -1.0029 0.1396 -7.182 6.89e-13 ***
# ---
# Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
#
# Approximate significance of smooth terms:
# edf Ref.df Chi.sq p-value
# s(Position) 4.591 4.910 368.91 < 2e-16 ***
# s(LengthBc) 2.612 3.041 13.59 0.003716 **
# s(WMCc) 1.000 1.000 4.24 0.039508 *
# ti(Position,WMCc) 8.295 10.004 30.77 0.000642 ***
# s(Subject) 7.405 8.000 44.90 8.53e-08 ***
# s(Item) 175.967 424.000 287.27 4.26e-15 ***
# ---
# Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
#
# R-sq.(adj) = 0.189 Deviance explained = 19.4%
# UBRE score = 0.074202 Scale est. = 1 n = 4260
AIC(m5,m6);AIC(m5)-AIC(m6)
# df AIC
# m5 202.5750 4569.742
# m6 200.8701 4576.102
# [1] -6.359856
relLik(m5,m6)
# AIC(x) AIC(y) diff relLik
# 4569.742447 4576.102302 -6.359856 24.045017
m5$gcv.ubre;m6$gcv.ubre
# [1] 0.07270949
# [1] 0.07420242
### YES, KEEP IT
### DO WE WANT THE Position X WMC INTERACTION?
### refit without Position X WMC interaction
m7<-update(m5,.~.-ti(Position,WMCc))
save(m7,file="models/m7_behav.rda")
summary(m7)
AIC(m5,m7);AIC(m5)-AIC(m7)
# df AIC
# m5 202.5750 4569.742
# m7 192.1037 4592.526
# [1] -22.7831
relLik(m5,m7)
# AIC(x) AIC(y) diff relLik
# 4569.7424 4592.5256 -22.7831 88570.3290
m5$gcv.ubre;m7$gcv.ubre
# [1] 0.07270949
# [1] 0.07805764
# KEEP IT
@
}
}}
{{% Table: behavioural results
% latex table generated in R 3.1.0 by xtable 1.7-3 package
% Fri Apr 11 12:06:10 2014
\begin{table}[ht]
\color{red}
\centering
\begin{tabular}{rlll}
\hline
& edf & Chi.sq & p-value \\
\hline
Pos & 4.6 & 367 & $<$ 0.001 \\
Lngth & 7.8 & 23.3 & 0.004 \\
WMC & 1 & 4 & 0.045 \\
Pos:Lngth & 5.5 & 11.3 & 0.136 \\
Pos:WMC & 8.4 & 30.8 & 0.001 \\
Lngth:WMC & 1 & 2.5 & 0.112 \\
Pos:Lngth:WMC & 9.5 & 14.3 & 0.325 \\
\hline
\end{tabular}
\caption{\textcolor{red}{Results of the behavioural analysis. \textit{Pos} =
\textit{Position}, \textit{Lngth} = \textit{Length}, \textit{WMC} =
\textit{Working Memory Capacity}, \textit{edf} = effective degrees of
freedom.}}
\label{tab:behavResults}
\end{table}
}}
{{% Rcode: plot behavioural results
{\footnotesize
<<>>=
library(mgcv)
library(boot)
library(fields)
# transform intercept back to probabilities
mn<-inv.logit(summary(m5)$p.table[1,1])
jpeg(filename="figs/behavioral_results_m5.jpeg",quality=100)
layout(matrix(c(1,1,2,2,1,1,2,2,1,1,2,2,
3,3,3,3),ncol=4,byrow=TRUE))
par(mar=c(6.1,3.6,4.1,1.1))
#mypal<-colorRampPalette(c("white","grey","black"))
mypal<-tim.colors
alpha<-180
zlimit<-c(0,1)
se<-2
# Position X Length
# get the predicted probability surface
pi<-visgam(m5,view=c("LengthBc","Position"),trans=inv.logit,
plot=FALSE,se=se)
sigmat<-sig2d(pi,ref=mn,plot.type="n")
image(pi$xm+mean(vars$LengthB),pi$ym,pi$mat,col=mypal(200),
xlab="LengthB",ylab="",zlim=zlimit,main="",
cex.axis=1.2,cex.lab=1.2,cex=1.2)
mtext(side=3,text="(A) Position X LengthB",cex=1.1,font=2,line=1)
mtext(side=2,text="Position",line=2,cex=0.95)
contour(pi$xm+mean(vars$LengthB),pi$ym,pi$mat,add=TRUE,labcex=0.85,
nlevels=10)
sigmat<-sigmat$sigmat
sigmat[sigmat==1]<-2
sigmat[is.na(sigmat)]<-1
sigmat[sigmat==2]<-NA
sigmat[,which(colnames(sigmat)>5)]<-NA
image(pi$xm+mean(vars$LengthB),pi$ym,sigmat,add=TRUE,
col=rgb(255,250,240,alpha,maxColorValue=255))
# Position X WMC
par(mar=c(6.1,3.1,4.1,1.1))
pi<-visgam(m5,view=c("WMCc","Position"),trans=inv.logit,
plot=FALSE,se=se)
sigmat<-sig2d(pi,ref=mn,plot.type="n")
image(pi$xm+mean(vars$WMC),pi$ym,pi$mat,col=mypal(200),
xlab="WMC",ylab="",zlim=zlimit,main="",
cex.axis=1.2,cex.lab=1.2,cex=1.2)
mtext(side=3,text="(B) Position X WMC",cex=1.1,font=2,line=1)
mtext(side=2,text="Position",line=2,cex=0.95)
contour(pi$xm+mean(vars$WMC),pi$ym,pi$mat,add=TRUE,labcex=0.85,
nlevels=10)
sigmat<-sigmat$sigmat
sigmat[sigmat==1]<-2
sigmat[is.na(sigmat)]<-1
sigmat[sigmat==2]<-NA
sigmat[,which(colnames(sigmat)>5)]<-NA
image(pi$xm+mean(vars$WMC),pi$ym,sigmat,add=TRUE,
col=rgb(255,250,240,alpha,maxColorValue=255))
# add legend
par(mar=c(5.1,0.1,0,0.1))
plot(0,type="n",ann=FALSE,axes=FALSE,frame.plot=FALSE)
image.plot(pi$xm+mean(vars$WMC),pi$ym,pi$mat,
col=mypal(200),legend.only=TRUE,zlim=zlimit,horizontal=TRUE,
legend.width=3,legend.line=-10,
legend.args=list(text="Probability of correct recall",line=1,
cex=1.1),legend.mar=23.1)
dev.off()
@
}
}}
{{% Figure: behavioral results
\begin{figure}[h]
\centering
\includegraphics[width=1\textwidth,height=0.65\textheight]{figs/behavioral_results_m5.jpeg}
\caption{Results of the behavioural analysis. (A): Effect of
\textcolor{red}{the $Position \times Length$ interaction on the probability of
correctly recalling a four-word sequence. (B): Effect of the $Position \times
WMC$ interaction. Bluer colors reflect smaller probabilities of correctly
recalling a sequence whereas redder ones indicate higher probabilities. The
numbers on the contour lines are recall probabilities. Regions shaded in white
are those whose 95\% confidence interval included the mean probability of
correct recall.}}
\label{fig:behaviouralResults}
\end{figure}
}}
The optimal GAMM model is \textcolor{red}{summarized in
Table~\ref{tab:behavResults}. The model included significant $Position \times
Length$ and $Position \times WMC$ interactions. Note that although the summary
table reports a {\it p}-value greater than 0.05 for the $Position \times
Length$ interaction, both the AIC comparison and the 95\% confidence interval
indicate that this interaction should remain in the
model.\footnote{\textcolor{red}{We do not consider the three-way $Position \times
Length \times WMC$ or the two-way $Length \times WMC$ interactions given that
$p > 0.05$ \emph{and} they were not supported by AIC comparisons.}}}
Results of the behavioural analysis are depicted in Figure
\ref{fig:behaviouralResults}. \textcolor{red}{In panel (A), it can be seen that
$Length$ had a small effect on the probability of correctly recalling a
four-word sequence. More specifically, recall probabilities were greatest when
the second word was 2 to 6 letters long at positions 5 and 6. Panel (B)
illustrates the $Position \times WMC$ interaction, which exhibited an overall
recency effect whereby more recently presented four-word sequences were more
likely to be correctly recalled.} \textcolor{red}{The recency effect} possibly
reflected an advantage in activation strength for the sequences that were
presented more recently within a block \cite{Jones2013}.
\textcolor{red}{However, it was more pronounced for participants with higher
working memory capacity scores. Furthermore, people with a WMC score below 0.7
did not reliably recall four-word sequences presented in positions 1, 2, and 3,
whereas people with a WMC score above 0.7 exhibited a small primacy effect in
position 1.}
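The relative likelihoods reported by \texttt{relLik} in the model comparisons
above follow directly from the AIC differences via the standard relation
$\exp(|\Delta \mathrm{AIC}|/2)$; a minimal sketch in base R, using the AIC
values printed above:

```r
# Relative likelihood from an AIC difference: exp(|delta AIC| / 2).
# AIC values copied from the m3 vs. m4 comparison above.
aic_m3 <- 4565.87546
aic_m4 <- 4576.98941
rel_lik <- exp((aic_m4 - aic_m3) / 2)
round(rel_lik, 2)  # ~259.04: m3 is about 259 times more likely than m4
```

This reproduces the \texttt{relLik} value of 259.03826 reported above for the
m3 versus m4 comparison.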
}}
{{\subsection{ERP Results}
{{\subsubsection{Discretized Variables and ANOVA}
Although our main goal in this paper is to demonstrate the application of
nonlinear analysis using continuous variables, we first show the data treated
categorically. $Position$ was treated as having 6 levels while $Length$ and
$WMC$ were dichotomized as high/low based on a median split. Figure
\ref{fig:FzMain} shows waveforms and mean N1a amplitudes for each level of each
of these three variables. No effect of $Length$ is apparent in this figure.
However, there do appear to be amplitude differences for different levels of
$Position$ (middle panel) --- in particular, items that appeared first in the
lists show a more negative N1a than subsequent positions. As such, there may
be a nonlinear relationship between $Position$ and N1a amplitude. The bottom
panel also suggests the possibility of an effect of $WMC$, where participants
with a higher $WMC$ showed a more negative N1a than those with a lower working
memory capacity.
{{% R code: plotMGCVsmoothTweaked
{\footnotesize
% eval=FALSE
<<>>=
# code borrowed from
# https://svn.r-project.org/R-packages/trunk/mgcv/R/plots.r
plotMGCVsmoothTweaked<<-function(x,select=1,se=TRUE,se1.mult=2,
seWithMean=FALSE,partial.resids=FALSE,n=100){
i<-select
first <- x$smooth[[i]]$first.para
last <- x$smooth[[i]]$last.para
edf <- sum(x$edf[first:last]) ## Effective DoF for this term
P<-mgcv:::plot.mgcv.smooth(x$smooth[[i]],P=NULL,
data=x$model,se=se,se1.mult=se1.mult,
seWithMean=seWithMean,n=n)
p <- x$coefficients[first:last] ## relevant coefficients
offset <- attr(P$X,"offset") ## any term specific offset
## get fitted values ....
if (is.null(offset)) P$fit <- P$X%*%p else P$fit <- P$X%*%p + offset
if (!is.null(P$exclude)) P$fit[P$exclude] <- NA
if (se && P$se) { ## get standard errors for fit
## test whether mean variability to be added to variability
## (only for centred terms)
if (seWithMean && attr(x$smooth[[i]],"nCons")>0) {
X1 <- matrix(x$cmX,nrow(P$X),ncol(x$Vp),byrow=TRUE)
meanL1 <- x$smooth[[i]]$meanL1
if (!is.null(meanL1)) X1 <- X1 / meanL1
X1[,first:last] <- P$X
se.fit <- sqrt(rowSums((X1%*%x$Vp)*X1))
} else se.fit <- ## se in centred (or anyway unconstrained) space only
sqrt(rowSums((P$X%*%x$Vp[first:last,first:last])*P$X))
if (!is.null(P$exclude)) se.fit[P$exclude] <- NA
} ## standard errors for fit completed
if (partial.resids) { P$p.resid <- fv.terms[,length(order)+i] + w.resid }
if (se && P$se) P$se <- se.fit*P$se.mult # Note multiplier
ul <- P$fit + P$se ## upper CL
ll <- P$fit - P$se ## lower CL
ylimit<-c(min(ll),max(ul))
return(invisible(list(x=P$x,y=P$fit,y1a=ll,y1b=ul,ylim=ylimit,
xlab=P$xlab,xlim=P$xlim,ylab=P$ylab,main=P$main,se.fit=P$se)))
}
@
}
}}
{{% R code: barplots and waveforms
{\footnotesize
% eval=FALSE
<<>>=
data(eegLong)
dat<-eegLong
rm(eegLong)
gc(TRUE,TRUE)
# mean center LengthB and dichotomize
dat$LengthBc <- dat$LengthB-mean(dat$LengthB)
dat$LengthBdc <- "short"
dat$LengthBdc[dat$LengthBc>median(dat$LengthBc,
na.rm = TRUE)] <- "long"
dat$LengthBdc <- as.factor(dat$LengthBdc)
#
# mean center WMC and dichotomize
dat$WMCc <- dat$WMC-mean(dat$WMC)
dat$WMCdc <- "low"
dat$WMCdc[dat$WMCc>median(dat$WMCc,
na.rm = TRUE)] <- "high"
dat$WMCdc <- as.factor(dat$WMCdc)
#
dat0<-dat
dat0<-dat0[dat0$Electrode=="Fz",]
rm(dat)
gc(TRUE,TRUE)
ylimit <- c(-7.5, 1)
ylimit2 <- c(-7,0)
### main effects
pdf(file="figs/FzMainEffects.pdf")
op <- par()$mar
par(mar = c(3.1, 4.1, 2.1, 3.1))
par(mfrow = c(3,2))
nk <- 25
tw <- c(80, 180)
e<-"Fz"
# Length
tmp<-dat0[dat0$Time<=200&dat0$Electrode==e,]
tmp$Position<-as.factor(tmp$Position)
tmp$newfac <- tmp$LengthBdc
avg <- tapply(tmp$Amplitude, list(tmp$newfac, tmp$Time), mean)
x <- cbind(avg[1, ], avg[2, ])
x <- cbind(x, as.numeric(rownames(x)))
x <- as.data.frame(x)
rownames(x) <- 1:nrow(x)
colnames(x) <- c(as.character(levels(tmp$newfac)), "Time")
xx <- list()
for(i in 1:nlevels(tmp$newfac)){
eval(parse(text = paste("xx[[i]] <-
plotMGCVsmoothTweaked(gam(", as.character(levels(
tmp$newfac))[i],"~s(Time, k = nk), data = x))",
sep = "")))
}
# waveforms
plot(xx[[1]]$x, xx[[1]]$y + mean(avg[1, ]),
type = "n", ylim = ylimit, xlab = "", ylab = "",
cex.axis=1.5,cex.lab=1.5,main="")
rect(tw[1], ylimit[1], tw[2], ylimit[2], density = NA,
col = grey(0.85))
par(new = TRUE)
plot(xx[[1]]$x, xx[[1]]$y + mean(avg[1, ]),
type = "l", ylim = ylimit, xlab = "",
ylab = "",cex.lab=1.5, axes=FALSE,main="")
mtext("Time",side=1,line=-1.5,cex=1.1)
mtext(eval(parse(text=paste("expression(",
paste("Amplitude~~(",expression(mu * V), ")",
sep=""),")",sep=""))),side=2,cex=1.25,line=2.5)
mtext("Length", side=3,line = 0.5,cex=1.25)
for(i in 2:nlevels(tmp$newfac)){
lines(xx[[i]]$x, xx[[i]]$y + mean(avg[i, ]), col = i)
}
abline(v=0,lty=3,col="grey")
# barplot
gr.avg <- tapply(dat0[dat0$Time>=tw[1]&dat0$Time<=tw[2],]$Amplitude,
dat0[dat0$Time>=tw[1]&dat0$Time<=tw[2],]$LengthBdc,mean)
barplot(gr.avg, col = 1:nlevels(dat0$LengthBdc),
ylab = "", ylim = ylimit2, legend.text = levels(dat0$LengthBdc),
args.legend = list(x="bottomleft", ncol = 4, cex = 1.75,
bty = "n", title = "",xjust = 1), names.arg = FALSE, cex.axis = 1.5)
# Position
#ylimit <- c(-6.5, 6.5)
#ylimit2 <- c(-3.5,0)
tmp<-dat0[dat0$Time<=200&dat0$Electrode==e,]
tmp$Position<-as.factor(tmp$Position)
tmp$newfac <- tmp$Position
avg <- tapply(tmp$Amplitude, list(tmp$newfac, tmp$Time), mean)
x <- cbind(avg[1, ], avg[2, ])
for(i in 3:nlevels(tmp$newfac)){
x <- cbind(x, avg[i, ])
}
x <- cbind(x, as.numeric(rownames(x)))
x <- as.data.frame(x)
rownames(x) <- 1:nrow(x)
colnames(x) <- c(paste("Pos",as.character(levels(tmp$newfac)),sep=""), "Time")
xx <- list()
for(i in 1:nlevels(tmp$newfac)){
eval(parse(text = paste("xx[[i]] <- plotMGCVsmoothTweaked(gam(",
paste("Pos", as.character(levels( tmp$newfac))[i],sep=""),
"~s(Time, k = nk), data = x))", sep = "")))
}
# waveform
plot(xx[[1]]$x, xx[[1]]$y + mean(avg[1, ]),
type = "n", ylim = ylimit, xlab = "", ylab = "",
cex.axis=1.5,cex.lab=1.5,main="")
rect(tw[1], ylimit[1], tw[2], ylimit[2], density = NA,
col = grey(0.85))
par(new = TRUE)
plot(xx[[1]]$x, xx[[1]]$y + mean(avg[1, ]),
type = "l", ylim = ylimit, xlab = "",
ylab = "",cex.lab=1.5, axes=FALSE,main="")
mtext("Position", side=3,line = 0.5,cex=1.25)
for(i in 2:nlevels(tmp$newfac)){
lines(xx[[i]]$x, xx[[i]]$y + mean(avg[i, ]), col = i)
}
abline(v=0,lty=3,col="grey")
# barplot
gr.avg <- tapply(dat0[dat0$Time>=tw[1]&dat0$Time<=tw[2],]$Amplitude,
as.factor(dat0[dat0$Time>=tw[1]&dat0$Time<=tw[2],]$Position),mean)
barplot(gr.avg, col = 1:nlevels(as.factor(dat0$Position)),
ylab = "", ylim = ylimit2, legend.text = levels(as.factor(dat0$Position)),
args.legend = list(x="bottomleft", ncol = 4, cex = 1.75,
bty = "n", title = "",xjust = 1), names.arg = FALSE, cex.axis = 1.5)
# WMC
#ylimit <- c(-5, 4.5)
#ylimit2 <- c(-2.75,0)
tmp<-dat0[dat0$Time<=200&dat0$Electrode==e,]
tmp$Position<-as.factor(tmp$Position)
tmp$newfac <- tmp$WMCdc
avg <- tapply(tmp$Amplitude, list(tmp$newfac, tmp$Time), mean)
x <- cbind(avg[1, ], avg[2, ])
x <- cbind(x, as.numeric(rownames(x)))
x <- as.data.frame(x)
rownames(x) <- 1:nrow(x)
colnames(x) <- c(as.character(levels(tmp$newfac)), "Time")
xx <- list()
for(i in 1:nlevels(tmp$newfac)){
eval(parse(text = paste("xx[[i]] <-
plotMGCVsmoothTweaked(gam(", as.character(levels(
tmp$newfac))[i],"~s(Time, k = nk), data = x))",
sep = "")))
}
# waveform
plot(xx[[1]]$x, xx[[1]]$y + mean(avg[1, ]),
type = "n", ylim = ylimit, xlab = "", ylab = "",
cex.axis=1.5,cex.lab=1.5,main="")
rect(tw[1], ylimit[1], tw[2], ylimit[2], density = NA,
col = grey(0.85))
par(new = TRUE)
plot(xx[[1]]$x, xx[[1]]$y + mean(avg[1, ]),
type = "l", ylim = ylimit, xlab = "",
ylab = "",cex.lab=1.5, axes=FALSE,main="")
mtext("Working Memory Capacity", side=3,line = 0.5,cex=1)
for(i in 2:nlevels(tmp$newfac)){
lines(xx[[i]]$x, xx[[i]]$y + mean(avg[i, ]), col = i)
}
abline(v=0,lty=3,col="grey")
# barplot
gr.avg <- tapply(dat0[dat0$Time>=tw[1]&dat0$Time<=tw[2],]$Amplitude,
dat0[dat0$Time>=tw[1]&dat0$Time<=tw[2],]$WMCdc,mean)
barplot(gr.avg, col = 1:nlevels(dat0$WMCdc),
ylab = "", ylim = ylimit2, legend.text = levels(dat0$WMCdc),
args.legend = list(x="bottomleft", ncol = 4, cex = 1.75,
bty = "n", title = "",xjust = 1), names.arg = FALSE, cex.axis = 1.5)
dev.off()
gc(TRUE,TRUE)
# interactions
pdf(file="figs/FzInteractions.pdf")
op <- par()$mar
par(mar = c(3.1, 4.1, 2.1, 3.1))
par(mfrow = c(4,2))
nk <- 25
tw <- c(80, 180)
e<-"Fz"
# Length X Position
#ylimit <- c(-7.5,7.5)
#ylimit2 <- c(-5.5,0)
tmp<-dat0[dat0$Time<=200&dat0$Electrode==e,]
tmp$Position<-as.factor(tmp$Position)
tmp$newfac <- as.factor(paste(tmp$LengthBdc,
tmp$Position, sep = "_"))
avg <- tapply(tmp$Amplitude, list(tmp$newfac, tmp$Time), mean)
x <- cbind(avg[1, ], avg[2, ])
for(i in 3:nlevels(tmp$newfac)){
x <- cbind(x, avg[i, ])
}
x <- cbind(x, as.numeric(rownames(x)))
x <- as.data.frame(x)
rownames(x) <- 1:nrow(x)
colnames(x) <- c(as.character(levels(tmp$newfac)), "Time")
xx <- list()
for(i in 1:nlevels(tmp$newfac)){
eval(parse(text = paste("xx[[i]] <-
plotMGCVsmoothTweaked(gam(", as.character(levels(
tmp$newfac))[i],"~s(Time, k = nk), data = x))",
sep = "")))
}
# waveform
plot(xx[[1]]$x, xx[[1]]$y + mean(avg[1, ]),
type = "n", ylim = ylimit, xlab = "", ylab = "",
cex.axis=1.5,cex.lab=1.5,main="")
rect(tw[1], ylimit[1], tw[2], ylimit[2], density = NA,
col = grey(0.85))
par(new = TRUE)
plot(xx[[1]]$x, xx[[1]]$y + mean(avg[1, ]),
type = "l", ylim = ylimit, xlab = "",
ylab = "",cex.lab=1.5, axes=FALSE,main="")
mtext("Time",side=1,line=-1.5,cex=1.1)
mtext(eval(parse(text=paste("expression(",
paste("Amplitude~~(",expression(mu * V), ")",
sep=""),")",sep=""))),side=2,cex=1.25,line=2.5)
mtext("Length X Position", side=3,line = 0.5,cex=1.1)
for(i in 2:nlevels(tmp$newfac)){
lines(xx[[i]]$x, xx[[i]]$y + mean(avg[i, ]), col = i)
}
abline(v=0,lty=3,col="grey")
dat0$newfac<-as.factor(paste(dat0$LengthBdc,
as.factor(dat0$Position),sep="_"))
# barplot
gr.avg <- tapply(dat0[dat0$Time>=tw[1]&dat0$Time<=tw[2],]$Amplitude,
dat0[dat0$Time>=tw[1]&dat0$Time<=tw[2],]$newfac,mean)
barplot(gr.avg, col = 1:nlevels(dat0$newfac),
ylab = "", ylim = ylimit2, legend.text = levels(dat0$newfac),
args.legend = list(x="bottomleft", ncol = 4, cex = 1,
bty = "n", title = "",xjust = 1), names.arg = FALSE, cex.axis = 1.5)
rm(avg,gr.avg,tmp,xx,x)
gc(TRUE,TRUE)
# Length X WMC
#ylimit <- c(-5,5)
#ylimit2 <- c(-3,0)
tmp<-dat0[dat0$Time<=200&dat0$Electrode==e,]
tmp$newfac <- as.factor(paste(tmp$LengthBdc,
tmp$WMCdc, sep = "_"))
avg <- tapply(tmp$Amplitude, list(tmp$newfac, tmp$Time), mean)
x <- cbind(avg[1, ], avg[2, ])
for(i in 3:nlevels(tmp$newfac)){
x <- cbind(x, avg[i, ])
}
x <- cbind(x, as.numeric(rownames(x)))
x <- as.data.frame(x)
rownames(x) <- 1:nrow(x)
colnames(x) <- c(as.character(levels(tmp$newfac)), "Time")
xx <- list()
for(i in 1:nlevels(tmp$newfac)){
eval(parse(text = paste("xx[[i]] <-
plotMGCVsmoothTweaked(gam(", as.character(levels(
tmp$newfac))[i],"~s(Time, k = nk), data = x))",
sep = "")))
}
plot(xx[[1]]$x, xx[[1]]$y + mean(avg[1, ]),
type = "n", ylim = ylimit, xlab = "", ylab = "",
cex.axis=1.5,cex.lab=1.5,main="")
rect(tw[1], ylimit[1], tw[2], ylimit[2], density = NA,
col = grey(0.85))
par(new = TRUE)
plot(xx[[1]]$x, xx[[1]]$y + mean(avg[1, ]),
type = "l", ylim = ylimit, xlab = "",
ylab = "",cex.lab=1.5, axes=FALSE,main="")
mtext("Length X WMC", side=3,line = 0.5,cex=1.1)
for(i in 2:nlevels(tmp$newfac)){
lines(xx[[i]]$x, xx[[i]]$y + mean(avg[i, ]), col = i)
}
abline(v=0,lty=3,col="grey")
# barplot
dat0$newfac<-as.factor(paste(dat0$LengthBdc,dat0$WMCdc,sep="_"))
gr.avg <- tapply(dat0[dat0$Time>=tw[1]&dat0$Time<=tw[2],]$Amplitude,
dat0[dat0$Time>=tw[1]&dat0$Time<=tw[2],]$newfac,mean)
barplot(gr.avg, col = 1:nlevels(dat0$newfac),
ylab = "", ylim = ylimit2, legend.text = levels(dat0$newfac),
args.legend = list(x="bottomleft", ncol = 2, cex = 1.75,
bty = "n", title = "",xjust = 1), names.arg = FALSE,cex.axis=1.5)
rm(avg,gr.avg,tmp,xx,x)
gc(TRUE,TRUE)
# WMC X Position
#ylimit <- c(-7.5,7.5)
#ylimit2 <- c(-5,0)
tmp<-dat0[dat0$Time<=200&dat0$Electrode==e,]
tmp$Position<-as.factor(tmp$Position)
tmp$newfac <- as.factor(paste(tmp$WMCdc,
tmp$Position, sep = "_"))
avg <- tapply(tmp$Amplitude, list(tmp$newfac, tmp$Time), mean)
x <- cbind(avg[1, ], avg[2, ])
for(i in 3:nlevels(tmp$newfac)){
x <- cbind(x, avg[i, ])
}
x <- cbind(x, as.numeric(rownames(x)))
x <- as.data.frame(x)
rownames(x) <- 1:nrow(x)
colnames(x) <- c(as.character(levels(tmp$newfac)), "Time")
xx <- list()
for(i in 1:nlevels(tmp$newfac)){
eval(parse(text = paste("xx[[i]] <-
plotMGCVsmoothTweaked(gam(", as.character(levels(
tmp$newfac))[i],"~s(Time, k = nk), data = x))",
sep = "")))
}
# waveform
plot(xx[[1]]$x, xx[[1]]$y + mean(avg[1, ]),
type = "n", ylim = ylimit, xlab = "", ylab = "",
cex.axis=1.5,cex.lab=1.5,main="")
rect(tw[1], ylimit[1], tw[2], ylimit[2], density = NA,
col = grey(0.85))
par(new = TRUE)
plot(xx[[1]]$x, xx[[1]]$y + mean(avg[1, ]),
type = "l", ylim = ylimit, xlab = "",
ylab = "",cex.lab=1.5, axes=FALSE,main="")
mtext("WMC X Position", side=3,line = 0.5,cex=1.1)
for(i in 2:nlevels(tmp$newfac)){
lines(xx[[i]]$x, xx[[i]]$y + mean(avg[i, ]), col = i)
}
abline(v=0,lty=3,col="grey")
# barplot
dat0$newfac<-as.factor(paste(dat0$WMCdc,
as.factor(dat0$Position),sep="_"))
gr.avg <- tapply(dat0[dat0$Time>=tw[1]&dat0$Time<=tw[2],]$Amplitude,
dat0[dat0$Time>=tw[1]&dat0$Time<=tw[2],]$newfac,mean)
barplot(gr.avg, col = 1:nlevels(dat0$newfac),
ylab = "",
ylim = ylimit2, legend.text = levels(dat0$newfac),
args.legend = list(x="bottomleft", ncol = 4, cex = 1,
bty = "n", title = "",xjust = 1), names.arg = FALSE,cex.axis=1.5)
rm(avg,gr.avg,tmp,xx,x)
gc(TRUE,TRUE)
# Length X Position X WMC
#ylimit <- c(-7.5,7.5)
#ylimit2 <- c(-7,0)
tmp<-dat0[dat0$Time<=200&dat0$Electrode==e,]
tmp$Position<-as.factor(tmp$Position)
tmp$newfac <- as.factor(paste(tmp$LengthBdc,
tmp$Position, tmp$WMCdc, sep = "_"))
avg <- tapply(tmp$Amplitude, list(tmp$newfac, tmp$Time), mean)
x <- cbind(avg[1, ], avg[2, ])
for(i in 3:nlevels(tmp$newfac)){
x <- cbind(x, avg[i, ])
}
x <- cbind(x, as.numeric(rownames(x)))
x <- as.data.frame(x)
rownames(x) <- 1:nrow(x)
colnames(x) <- c(as.character(levels(tmp$newfac)), "Time")
xx <- list()
for(i in 1:nlevels(tmp$newfac)){
eval(parse(text = paste("xx[[i]] <-
plotMGCVsmoothTweaked(gam(", as.character(levels(
tmp$newfac))[i],"~s(Time, k = nk), data = x))",
sep = "")))
}
# waveform
plot(xx[[1]]$x, xx[[1]]$y + mean(avg[1, ]),
type = "n", ylim = ylimit, xlab = "", ylab = "",
cex.axis=1.5,cex.lab=1.5,main="")
rect(tw[1], ylimit[1], tw[2], ylimit[2], density = NA,
col = grey(0.85))
par(new = TRUE)
plot(xx[[1]]$x, xx[[1]]$y + mean(avg[1, ]),
type = "l", ylim = ylimit, xlab = "",
ylab = "",cex.lab=1.5, axes=FALSE,main="")
mtext("Length X Position X WMC", side=3,line = 0.5,cex=1)
for(i in 2:nlevels(tmp$newfac)){
lines(xx[[i]]$x, xx[[i]]$y + mean(avg[i, ]), col = i)
}
abline(v=0,lty=3,col="grey")
# barplot
dat0$newfac<-as.factor(paste(dat0$LengthBdc,as.factor(dat0$Position),
dat0$WMCdc,sep="_"))
gr.avg <- tapply(dat0[dat0$Time>=tw[1]&dat0$Time<=tw[2],]$Amplitude,
dat0[dat0$Time>=tw[1]&dat0$Time<=tw[2],]$newfac,mean)
barplot(gr.avg, col = 1:nlevels(dat0$newfac),
ylab = "", ylim = c(-9,0), legend.text = levels(dat0$newfac),
args.legend = list(x="bottomleft", ncol = 4, cex = 0.785,
bty = "n", title = "",xjust = 1, yjust = 1), names.arg = FALSE,
cex.axis = 1.5)
rm(avg,gr.avg,tmp,xx,x)
gc(TRUE,TRUE)
par(mfrow = c(1, 1))
par(mar = op)
dev.off()
@
}
}}
{{% Figure: main effects -- dichotomous
\begin{figure}[h]
\centering
\includegraphics{figs/FzMainEffects.pdf}
\caption{Waveforms (left panels) and mean amplitudes (right panels) of the N1a
at electrode Fz, in the 80--180 ms window, plotted separately for discretized
levels of each independent variable. The {\it x}-axis represents time in
milliseconds and the {\it y}-axis is amplitude in $\mu V$.}
\label{fig:FzMain}
\end{figure}
}}
{{% Rcode: traditional ANOVA
{\footnotesize
% eval=FALSE
<<>>=
# perform traditional anova
data(eegLong)
dat<-eegLong
rm(eegLong)
gc(TRUE,TRUE)
# mean center LengthB and dichotomize
dat$LengthBc <- dat$LengthB-mean(dat$LengthB)
dat$LengthBdc <- "short"
dat$LengthBdc[dat$LengthBc>median(dat$LengthBc,
na.rm = TRUE)] <- "long"
dat$LengthBdc <- as.factor(dat$LengthBdc)
#
# mean center WMC and dichotomize
dat$WMCc <- dat$WMC-mean(dat$WMC)
dat$WMCdc <- "low"
dat$WMCdc[dat$WMCc>median(dat$WMCc,
na.rm = TRUE)] <- "high"
dat$WMCdc <- as.factor(dat$WMCdc)
#
dat$Position<-as.factor(dat$Position)
dat<-dat[dat$Electrode=="Fz",]
dat <- dat[dat$Time >= 80 & dat$Time <= 180, ]
gc()
#
# aggregate data
dat<-aggregate(dat$Amplitude,list(Subject=dat$Subject,
Position=dat$Position,Length=dat$LengthBdc,
WMC=dat$WMCdc),mean)
colnames(dat)[which(colnames(dat)=="x")]<-"Amplitude"
#
# analyze data
m0<-lmer(Amplitude~Position*Length*WMC+(1|Subject),data=dat)
m1<-lmer(Amplitude~Position*Length*WMC+(Length|Subject),data=dat)
m2<-lmer(Amplitude~Position*Length*WMC+(Position|Subject),data=dat)
m3<-lmer(Amplitude~Position*Length*WMC+(Length|Subject)+
(Position|Subject),data=dat)
AIC(m0,m1,m2,m3)
# df AIC
# m0 26 305.7619
# m1 28 308.5593
# m2 46 332.6488
# m3 49 338.5117
#
pamer.fnc(m0)
# Df Sum Sq Mean Sq F value upper.den.df upper.p.val
# Position 5 14.8777 2.9755 6.1729 96 0.0001
# Length 1 0.8006 0.8006 1.6608 96 0.2006
# WMC 1 0.0724 0.0724 0.1502 96 0.6992
# Position:Length 5 1.7569 0.3514 0.7290 96 0.6034
# Position:WMC 5 2.6159 0.5232 1.0854 96 0.3735
# Length:WMC 1 0.6541 0.6541 1.3570 96 0.2469
# Position:Length:WMC 5 0.6817 0.1363 0.2828 96 0.9215
# lower.den.df lower.p.val expl.dev.(%)
# Position 86 0.0001 17.6326
# Length 86 0.2010 0.9488
# WMC 86 0.6993 0.0858
# Position:Length 86 0.6036 2.0823
# Position:WMC 86 0.3743 3.1003
# Length:WMC 86 0.2473 0.7752
# Position:Length:WMC 86 0.9213 0.8079
m1<-bfFixefLMER_F.fnc(m0)
pamer.fnc(m1)
# Df Sum Sq Mean Sq F value upper.den.df upper.p.val lower.den.df
# Position 5 7.0645 1.4129 3.9422 114 0.0025 44
# lower.p.val expl.dev.(%)
# Position 0.0049 8.3727
plotLMER.fnc(m1,pred="Position",addlines=TRUE)
# effect size (range) for Position is 1.027624
@
}
}}
Figure \ref{fig:FzInter} shows waveforms and mean amplitudes for two- and
three-way combinations of levels of the independent variables (corresponding to
the two- and three-way interactions in an ANOVA model). \textcolor{red}{There
is a possible} {\it Length} $\times$ {\it Position} interaction, where long
words in first position \textcolor{red}{may have} elicited a more negative N1a
than short words in first position. There \textcolor{red}{also} may be a {\it
Length} $\times$ {\it WMC} interaction and a {\it WMC} $\times$ {\it Position}
interaction. In the former case, shorter words \textcolor{red}{may have}
elicited a smaller N1a in participants with a low working memory capacity. In
the latter case, the position effect on N1a amplitudes \textcolor{red}{may have
been} stronger in participants with high than low working memory capacity.
Finally, there is possibly a three-way interaction (bottom panel), where long
words in the first position elicited a more negative N1a in participants with
low than high working memory capacity (the leftmost black and red bars).
\textcolor{red}{We performed a repeated measures ANOVA to test these
observations. It revealed that only the main effect of $Position$ was
reliable ($F(5, 96) = 6.2$, $p < 0.001$).}
{{% Figure: interactions
\begin{figure}[h]
\centering
\includegraphics{figs/FzInteractions.pdf}
\caption{Two-way and three-way interactions between dichotomized levels of {\it
Length}, {\it Position} and {\it WMC}. Details are as in Figure
\ref{fig:FzMain}.}
\label{fig:FzInter}
\end{figure}
}}
}}
{{\subsubsection{GAMM Analysis Assuming Linearity}
{{% R code: gamm models assuming linearity
{\footnotesize
% eval=FALSE
<<>>=
library(nlEEG)
data(eegLong)
dat<-eegLong
rm(eegLong)
gc(TRUE,TRUE)
# restrict to electrode Fz and 80--180 ms window
dat <- dat[dat$Electrode == "Fz", ]
dat$Electrode<-as.factor(as.character(dat$Electrode))
dat <- dat[dat$Time >= 80 & dat$Time <= 180, ]
dat$LengthBc <- dat$LengthB - mean(dat$LengthB)
dat$WMCc <- dat$WMC - mean(dat$WMC)
dat$Position <- dat$Position - mean(dat$Position)
dat0 <- dat
#
library(parallel)
cl<-makeCluster(8)
# fit models
m1 <- bam(Amplitude ~ Position * LengthBc * WMCc +
s(Subject,bs="re"), data = dat, samfrac=0.1,cluster=cl)
dat <- romr.fnc(m1, dat, 2.5)$data
# n.removed = 735
# percent.removed = 1.350706
m1 <- update(m1)
#
dat <- dat0
m2 <- bam(Amplitude ~ Position * LengthBc * WMCc +
s(Subject,bs="re")+s(Item,bs="re"), data = dat,
samfrac=0.1,cluster=cl)
dat <- romr.fnc(m2, dat, 2.5)$data
# n.removed = 733
# percent.removed = 1.34703
m2 <- update(m2)
AIC(m1);AIC(m2);AIC(m1)-AIC(m2)
stopCluster(cl)
#
save(m1, file = "models/m1_lin.rda", compress = "xz")
save(m2, file = "models/m2_lin.rda", compress = "xz")
smry<-summary(m2)
save(smry,file="smry/smry_m2_lin.rda")
@
}
}}
The present analysis \textcolor{red}{(using only the parametric component of
GAMM)} simply treated each variable as continuous. Results are shown in
Table~\ref{tab:ContSmry}. This model suggests that there was
a significant main effect of $Position$ \textcolor{red}{and a significant
$Length \times WMC$ interaction. These effects are depicted in Figure
\ref{fig:linear}.}
{{% Rcode: plot results assuming linearity
<<>>=
load("models/m2_lin.rda")
load("smry/smry_m2_lin.rda")
data(vars)
smry<-smry$p.table
len<-range(sort(unique(m2$model$LengthBc)))
wmc<-range(sort(unique(m2$model$WMCc)))
n<-200
newdat<-expand.grid(Length=seq(len[1],len[2],length.out=n),
WMC=seq(wmc[1],wmc[2],length.out=n))
#newdat$y1<-newdat$Length*smry["LengthBc","Estimate"]
#newdat$y2<-newdat$Length*smry["WMCc","Estimate"]
newdat$y3<-(newdat[,1]*newdat[,2])*smry["LengthBc:WMCc","Estimate"]
#newdat$y<-newdat$y1+newdat$y2+newdat$y3
newdat$y<-newdat$y3
newdat$ll<-newdat$y-2*smry["LengthBc:WMCc","Std. Error"]
newdat$ul<-newdat$y+2*smry["LengthBc:WMCc","Std. Error"]
mat<-matrix(newdat$y,ncol=n,byrow=TRUE)+smry["(Intercept)","Estimate"]
colnames(mat)<-seq(len[1],len[2],length.out=n)
rownames(mat)<-seq(wmc[1],wmc[2],length.out=n)
ulm<-matrix(newdat$ul,ncol=n,byrow=TRUE)+smry["(Intercept)","Estimate"]
colnames(ulm)<-seq(len[1],len[2],length.out=n)
rownames(ulm)<-seq(wmc[1],wmc[2],length.out=n)
llm<-matrix(newdat$ll,ncol=n,byrow=TRUE)+smry["(Intercept)","Estimate"]
colnames(llm)<-seq(len[1],len[2],length.out=n)
rownames(llm)<-seq(wmc[1],wmc[2],length.out=n)
xm<-as.numeric(colnames(mat))
ym<-as.numeric(rownames(mat))
sigmat<-sig2d(xm,ym,mat,llm,ulm,plot.type="n",ref=0)
sigmat<-sigmat$sigmat
pdf(file="figs/linear_results.pdf")
zlimit<-c(-5,5)
alpha<-160
layout(matrix(c(1,1,2,2,1,1,2,2,1,1,2,2,
3,3,3,3),ncol=4,byrow=TRUE))
par(mar=c(6.1,3.6,4.1,1.1))
# Position
pos<-sort(unique(m2$model$Position))
plot(m2,select=3,se=2,all.terms=TRUE,rug=FALSE,seWithMean=TRUE,ann=FALSE,
axes=FALSE,frame.plot=FALSE)
par(new=TRUE)
plot(1:6,pos*smry["Position","Estimate"]+smry["(Intercept)","Estimate"],
type="n",xlab="Position",ylab="",ylim=c(-2.5,-1.7))
mtext(side=3,text="(A) Position",cex=1.1,font=2,line=1)
mtext(side=2,text=eval(parse(text=paste("expression(",paste("Amplitude~~(",
expression(mu * V),")",sep=""),")",sep=""))),line=2.25,cex=0.85)
# Position X WMC
par(mar=c(6.1,3.1,4.1,1.1))
image(xm+mean(vars$LengthB),ym+mean(vars$WMC),mat,col=tim.colors(n),
xlab="Length",ylab="WMC",zlim=zlimit,main="",
cex.axis=1.2,cex.lab=1.2,cex=1.2)
mtext(side=3,text="(B) Length X WMC",cex=1.1,font=2,line=1)
mtext(side=2,text="WMC",line=2,cex=0.95)
contour(xm+mean(vars$LengthB),ym+mean(vars$WMC),mat,add=TRUE,
labcex=0.85,nlevels=10)
image(xm+mean(vars$LengthB),ym+mean(vars$WMC),sigmat,add=TRUE,
col=rgb(255,250,240,alpha,maxColorValue=255))
# add legend
par(mar=c(5.1,0.1,0,0.1))
plot(0,type="n",ann=FALSE,axes=FALSE,frame.plot=FALSE)
image.plot(xm+mean(vars$LengthB),ym+mean(vars$WMC),mat,
col=tim.colors(n),legend.only=TRUE,zlim=zlimit,horizontal=TRUE,
legend.width=3,legend.line=-10,
legend.args=list(text=eval(parse(text=paste("expression(",
paste("Amplitude~~(",expression(mu * V),")",sep=""),")",
sep=""))),line=1,cex=1.1),legend.mar=23.1)
dev.off()
@
}}
{{% Figure: ERP analysis assuming linearity
\begin{figure}[h]
\centering
\includegraphics{figs/linear_results.pdf}
\caption{\textcolor{red}{Results of the ERP analysis assuming linearity. In
panel (A), the \textit{x}-axis is $Position$ and the \textit{y}-axis is
$Amplitude$ ($\mu V$). The broken lines are 95\% confidence intervals. In panel
(B), the \textit{x}-axis is $Length$ and the \textit{y}-axis is $WMC$. The
amplitude of the N1a is shown using the same color coding as in Figure
\ref{fig:topomap141ms}. The scale is provided at the bottom of the figure. The
black lines are isovoltage contours; the small numbers give the voltage in
microvolts.}}
\label{fig:linear}
\end{figure}
}}
\textcolor{red}{The $Position$ effect shown in panel (A) suggests that the
amplitude of the N1a decreased (became more positive) as position
increased.\footnote{Because the N1a has a negative polarity, we refer to less
negative amplitudes as ``decreases'' and more negative amplitudes as
``increases''.} It is illustrated in panel (B) that the length of the second
word of a sequence modulated the amplitude of the N1a. More specifically, it
increased (became more negative) as the working memory capacity score of a
participant increased for words 1 to 6 letters long, but decreased (became more
positive) for words 7 to 12 letters long.}
While it is notable that these effects were found under the assumption of
linearity, Figure \ref{fig:FzMain} suggested that $Position$ had a nonlinear
relationship with N1a amplitude. Describing the source of this main effect
would necessitate drawing inferences over the six levels of Position from a
relatively large number of pairwise comparisons.\footnote{While contrasts such
as Helmert could also describe this relationship somewhat more parsimoniously,
we had no a priori assumption as to the shape of the relationship between
Position and N1a amplitude, so the choice of appropriate contrast weights would
have to be made post hoc.} Fitting a single nonlinear function to $Position$
could simplify the description and interpretation of the effect without
recourse to cumbersome post hoc testing. In addition, it may be the case that
the assumption of linearity missed important structure in the data. In the next
section, we explore models for which linearity is {\it not} assumed.
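The core idea behind those models can be previewed with a toy sketch on
simulated data (ours, not from the experiment) using the {\tt mgcv} package,
which is used for all GAMMs below: replacing a linear term {\tt x} with a
penalized smooth {\tt s(x)} lets the shape of the relationship be estimated
from the data rather than assumed.

```r
# Toy sketch (simulated data, not from the experiment): a straight-line
# fit versus a penalized smooth in mgcv.
library(mgcv)
set.seed(1)
d <- data.frame(x = runif(300, 0, 10))
d$y <- sin(d$x) + rnorm(300, sd = 0.3)   # clearly nonlinear truth
fit_lin    <- gam(y ~ x,    data = d)    # linearity assumed
fit_smooth <- gam(y ~ s(x), data = d)    # shape estimated (penalized spline)
AIC(fit_lin) - AIC(fit_smooth)           # positive: the smooth model wins
```

The only change needed is wrapping the predictor in {\tt s()}; the penalty on
the spline's wiggliness guards against overfitting.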
{{% R code: results table
{\footnotesize
<<>>=
load("smry/smry_m2.rda")
x<-smry$p.table
x[,c(1,2)]<-round(x[,c(1,2)],3)
x[,3]<-round(x[,3],1)
x[,4]<-round(x[,4],3)
x[1:2,4]<-"< 0.001"
rownames(x)<-c("(Intercept)","Pos","Lngth","WMC","Pos:Lngth","Pos:WMC","Lngth:WMC","Pos:Lngth:WMC")
x<-x[2:nrow(x),]
colnames(x)[ncol(x)-1]<-"t-value"
colnames(x)[ncol(x)]<-"p-value"
print(xtable(x, caption = "Electrode Fz -- 80 to 180 ms. Probability (\\textit{p}) values from model for which linearity was assumed. \\textit{Pos} = \\textit{Position}. \\textit{Lngth} = \\textit{Length}. \\textit{WMC} = \\textit{Working Memory Capacity}.", label = "tab:ContSmry"))
@
}
}}
{{% Table: results assuming linearity
% latex table generated in R 3.1.0 by xtable 1.7-3 package
% Sun Apr 13 19:55:49 2014
\begin{table}[ht]
\color{red}
\centering
\begin{tabular}{r d{3} d{3} d{2} d{3}}
\hline
& Estimate & Std. Error & t-value & p-value \\
\hline
Pos & 0.142 & 0.018 & 8.0 & $<$ 0.001 \\
Lngth & -0.037 & 0.029 & -1.3 & 0.198 \\
WMC & -0.645 & 1.376 & -0.5 & 0.639 \\
Pos:Lngth & 0.005 & 0.010 & 0.5 & 0.635 \\
Pos:WMC & -0.255 & 0.167 & -1.5 & 0.125 \\
Lngth:WMC & 0.472 & 0.152 & 3.1 & 0.002 \\
Pos:Lngth:WMC & -0.034 & 0.095 & -0.4 & 0.722 \\
\hline
\end{tabular}
\caption{\textcolor{red}{Model for which linearity \textbf{was} assumed.
\textit{Pos} = \textit{Position}. \textit{Lngth} = \textit{Length}.
\textit{WMC} = \textit{Working Memory Capacity}.}}
\label{tab:ContSmry}
\end{table}
}}
}}
{{\subsection{Moving Away from the Assumption of Linearity}
{{% models without factor variables -- 3-way interaction
The possibility of nonlinear relationships between our predictor variables and
N1a amplitude is supported by Figure \ref{fig:ContAvg}, in which mean N1a
amplitude at each sampled level of each variable is plotted, along with
best-fitting nonlinear curves (lowess smooths).\footnote{A lowess smooth
uses locally-weighted polynomial regression. `Local' is defined as a window of
a certain span around each data point (e.g., 50 data points to the left and to
the right). Each data point within a window is influenced (i.e., weighted) by
its ``neighbours''. See {\tt ?lowess} in {\tt R} for more details.} The $Length$
effect (top left panel) appears to be curved with relatively constant (and
small) N1a amplitudes from values 1 to roughly 6, after which N1a amplitude
increases with increasing word length \textcolor{red}{(very similar to the
$Length$ effect obtained in the behavioral analysis, see panel (A) of Figure
\ref{fig:behaviouralResults})}. The effect of $Position$ (bottom left panel)
shows an opposite pattern. The amplitude of the N1a is most negative in the
first position, less so in second, third, and fourth positions, and then
increases slightly in the fifth and sixth positions. The $WMC$ effect (top
right panel) is more complex, with an initial increase in N1a amplitude from
the lowest to low-middle $WMC$ levels, followed by a decrease and then a
subsequent increase for people with the highest $WMC$.
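As a toy illustration of what a lowess smooth computes (simulated data, not
from the experiment), the base-{\tt R} {\tt lowess} function returns the
coordinates of a locally-weighted polynomial fit; its {\tt f} argument sets
the span, i.e., the fraction of the data in each local window:

```r
# Toy sketch of a lowess smooth on simulated data.
set.seed(1)
x <- 1:100
y <- sin(x / 15) + rnorm(100, sd = 0.2)
sm <- lowess(x, y, f = 2/3)  # default span: 2/3 of the data per window
# sm$x is x in sorted order; sm$y is the smoothed estimate at each x
plot(x, y)
lines(sm, col = 2)
```

This is exactly the pattern used in the plotting code below:
{\tt lines(lowess(...), col = 2)} overlays the red smooth on the averages.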
{{% R code: create continuous averages figure
{\footnotesize
% eval=FALSE
<<>>=
library(nlEEG)
data(eegLong)
dat<-eegLong
rm(eegLong)
gc(TRUE,TRUE)
# restrict to electrode Fz and 80--180 ms window
dat <- dat[dat$Electrode == "Fz", ]
dat$Electrode<-as.factor(as.character(dat$Electrode))
dat <- dat[dat$Time >= 80 & dat$Time <= 180, ]
pdf(file="figs/ContinuousAverages.pdf")
par(mfrow=c(2,2))
# Amplitude ~ Length
avg<-tapply(dat$Amplitude,dat$LengthB,mean)
plot(as.numeric(names(avg)),avg,xlab="Length",ylab="",
main="Amplitude ~ Length",ylim=c(-4,-1.25),cex.lab=1.25,cex.main=1.5,
cex.axis=1.25)
mtext(eval(parse(text=paste("expression(",paste("Amplitude~~(",
expression(mu * V), ")",sep=""),")",sep=""))),side=2,cex=1.25,
line=2.25)
lines(lowess(as.numeric(names(avg)),avg),col=2)
legend("bottomleft",legend=c("average","lowess smooth"),
lty=c(NA,1),pch=c(21,NA),col=1:2,bty="n",cex=1.25)
# Amplitude ~ WMC
avg<-tapply(dat$Amplitude,dat$WMC,mean)
plot(as.numeric(names(avg)),avg,xlab="WMC",ylab="",main="Amplitude ~ WMC",
ylim=c(-2.6,-0.5),cex.lab=1.25,cex.main=1.5,cex.axis=1.25)
lines(lowess(as.numeric(names(avg)),avg),col=2)
legend("topleft",legend=c("average","lowess smooth"),
lty=c(NA,1),pch=c(21,NA),col=1:2,bty="n",cex=1.25)
# Amplitude ~ Position
avg<-tapply(dat$Amplitude,dat$Position,mean)
plot(as.numeric(names(avg)),avg,xlab="Position",ylab="",
main="Amplitude ~ Position",ylim=c(-2.75,-1.7),cex.lab=1.25,
cex.main=1.5,cex.axis=1.25)
lines(lowess(as.numeric(names(avg)),avg),col=2)
legend("bottomright",legend=c("average","lowess smooth"),
lty=c(NA,1),pch=c(21,NA),col=1:2,bty="n",cex=1.25)
par(mfrow=c(1,1))
dev.off()
@
}
}}
{{% Figure: continuous averages
\begin{figure}[h]
\centering
\includegraphics[width=0.7\textwidth,height=0.55\textheight]{figs/ContinuousAverages.pdf}
\caption{N1a amplitude averages as a function of each one of the predictor
variables, treated as continuous (in contrast with the discretized forms of
these variables shown in Figure \ref{fig:FzMain}). The black circles are the
mean amplitude at each measured level of each variable. The red lines are {\tt
lowess} smooths of the averages.}
\label{fig:ContAvg}
\end{figure}
}}
We thus fitted a GAMM without assuming linearity. It included a three-way
interaction $Position \times Length \times WMC$ and all lower order
interactions and main effects, with by-subject and by-item
random effects. We allowed the model to use up to
\Sexpr{6+10+10+5*5+5*5+5*5+5*5*5+1} degrees of freedom, but the estimation
procedure determined that 55 degrees of freedom sufficed. This model was
more likely given the data than the one assuming linearity (model assuming
linearity: \textit{df} = 9, \textit{AIC} = 357382; model not assuming
linearity: \textit{df} = 55, \textit{AIC} = 357228; difference: \textit{df} =
46, \textit{AIC} = 154). Results are provided in Table~\ref{tab:rcsSmry}. The
plot of the three-way interaction is shown in Figure \ref{fig:m2nonlin}.
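The relative likelihood reported by {\tt relLik} follows directly from the AIC
difference. A minimal sketch using the AIC values just reported (we take {\tt
relLik} to compute $\exp(\Delta \textit{AIC}/2)$, which matches the {\tt
relLik} outputs shown in the chunks below):

```r
# Relative likelihood from an AIC difference, using the (rounded) AIC
# values reported in the text for the linear and nonlinear models.
aic_linear    <- 357382
aic_nonlinear <- 357228
aic_diff <- aic_linear - aic_nonlinear   # 154
rel_lik  <- exp(aic_diff / 2)            # on the order of 10^33
```

That is, the nonlinear model is astronomically more likely given the data,
despite the modest-looking AIC difference of 154.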
{{% R code: fitting the models
{\footnotesize
% eval=FALSE
<<>>=
############################################################
# load data #
############################################################
library(nlEEG)
library(LMERConvenienceFunctions)
data(eegLong)
dat<-eegLong
rm(eegLong)
gc(TRUE,TRUE)
# restrict to electrode Fz and 80--180 ms window
dat <- dat[dat$Electrode == "Fz", ]
dat$Electrode<-as.factor(as.character(dat$Electrode))
dat <- dat[dat$Time >= 80 & dat$Time <= 180, ]
rownames(dat)<-1:nrow(dat)
# mean center things
dat$LengthBc <- dat$LengthB - mean(dat$LengthB)
dat$WMCc <- dat$WMC - mean(dat$WMC)
dat$Position <- dat$Position - mean(dat$Position)
############################################################
# fit model #
############################################################
### GAMM analysis
cl<-makeCluster(8)
m0<-bam(Amplitude~s(Position,k=6)+s(LengthBc)+s(WMCc)+
ti(Position,LengthBc)+ti(Position,WMCc)+ti(LengthBc,WMCc)+
ti(Position,LengthBc,WMCc)+s(Subject,bs="re"),data=dat,
cluster=cl,samfrac=0.1,gc.level=2)
### trim data 2.5 STDs below and above the mean of model residuals
dat<-romr.fnc(m0,dat)$data
# n.removed = 741
# percent.removed = 1.361732
m0<-update(m0)
# add by-item random intercepts
m2<-update(m0,.~.+s(Item,bs="re"))
relLik(m0,m2)
# AIC(x) AIC(y) diff relLik
# 3.576903e+05 3.572280e+05 4.622988e+02 2.437318e+100
### want by-item random intercepts
### looking at random slopes is computationally intractable
### check if want autocorrelation
mytime<-sort(unique(dat$Time))
dat$NewTimeSeries<-FALSE
dat$NewTimeSeries[dat$Time==mytime[1]]<-TRUE
unique(dat[dat$Time==mytime[1],"NewTimeSeries"])
# [1] TRUE
unique(dat[dat$Time!=mytime[1],"NewTimeSeries"])
# [1] FALSE
m3<-update(m2,.~.,rho=0.8,AR.start=dat$NewTimeSeries)
relLik(m2,m3)
# AIC(x) AIC(y) diff relLik
# 3.572280e+05 3.585619e+05 -1.333817e+03 4.312424e+289
### don't want rho
stopCluster(cl)
(smry<-summary(m2))
# Family: gaussian
# Link function: identity
#
# Formula:
# Amplitude ~ s(Position, k = 6) + s(LengthBc) + s(WMCc) + ti(Position,
# LengthBc) + ti(Position, WMCc) + ti(LengthBc, WMCc) + ti(Position,
# LengthBc, WMCc) + s(Subject, bs = "re") + s(Item, bs = "re")
#
# Parametric coefficients:
# Estimate Std. Error t value Pr(>|t|)
# (Intercept) -2.0627 0.1491 -13.83 <2e-16 ***
# ---
# Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
#
# Approximate significance of smooth terms:
# edf Ref.df F p-value
# s(Position) 3.681 4.296 27.584 < 2e-16 ***
# s(LengthBc) 1.251 1.328 0.969 0.332287
# s(WMCc) 1.001 1.001 0.234 0.629000
# ti(Position,LengthBc) 9.264 11.324 2.870 0.000798 ***
# ti(Position,WMCc) 12.223 14.393 4.484 2.86e-08 ***
# ti(LengthBc,WMCc) 3.792 5.082 2.774 0.015971 *
# ti(Position,LengthBc,WMCc) 23.765 32.848 1.571 0.019902 *
# s(Subject) 7.671 8.000 23.025 < 2e-16 ***
# s(Item) 269.767 424.000 1.749 < 2e-16 ***
# ---
# Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
#
# R-sq.(adj) = 0.0227 Deviance explained = 2.88%
# fREML score = 1.7873e+05 Scale est. = 45.208 n = 53675
save(smry,file="smry/smry_m2_nonlin.rda")
save(m2,file="models/m2_nonlin.rda")
### check model assumptions and number of knots
png(filename="figs/m2_nonlin.png")
gam.check(m2)
dev.off()
# Method: fREML Optimizer: perf newton
# full convergence after 14 iterations.
# Gradient range [-0.0003367791,0.000375822]
# (score 178732.9 & scale 45.2084).
# Hessian positive definite, eigenvalue range [0.0001485927,26834.18].
#
# Basis dimension (k) checking results. Low p-value (k-index<1) may
# indicate that k is too low, especially if edf is close to k'.
#
# k' edf k-index p-value
# s(Position) 5.000 3.681 1.024 0.94
# s(LengthBc) 9.000 1.251 1.018 0.93
# s(WMCc) 9.000 1.001 0.996 0.43
# ti(Position,LengthBc) 16.000 9.264 1.006 0.66
# ti(Position,WMCc) 16.000 12.223 0.998 0.34
# ti(LengthBc,WMCc) 16.000 3.792 1.017 0.90
# ti(Position,LengthBc,WMCc) 64.000 23.765 0.990 0.24
# s(Subject) 10.000 7.671 NA NA
# s(Item) 426.000 269.767 NA NA
@
}
}}
{{% R code: compare lin vs nonlin
<<>>=
load("models/m2_lin.rda")
m2a <- m2
load("models/m2_nonlin.rda")
(tmp<-relLik(m2a,m2))
# AIC(x) AIC(y) diff relLik
# 3.573818e+05 3.572280e+05 1.538014e+02 2.497767e+33
AIC(m2a,m2)
# df AIC
# m2a 307.8714 357381.8
# m2 334.4147 357228.0
save(tmp,file="smry/compare_linear_and_nonlinear.rda")
@
}}
{{% R code & Table: results table
{\footnotesize
<<>>=
load("smry/smry_m2_nonlin.rda")
x<-smry$s.table
x<-x[,c(1,3,4)]
x[,1]<-round(x[,1],0)
x[,2]<-round(x[,2],1)
x[,3]<-round(x[,3],3)
x<-x[1:7,]
rownames(x)<-c("Pos","Lngth","WMC","Pos:Lngth","Pos:WMC","Lngth:WMC","Pos:Lngth:WMC")
x[1,3]<-"< 0.001"
x[5,3]<-"< 0.001" # Pos:WMC (p = 2.86e-08 in the model summary)
print(xtable(x, caption = "Electrode Fz -- 80 to 180 ms. Probability (\\textit{p}) values from model for which linearity was \\textbf{not} assumed. \\textit{Pos} = \\textit{Position}. \\textit{Lngth} = \\textit{Length}. \\textit{WMC} = \\textit{Working Memory Capacity}. \\textcolor{red}{\\textit{edf} = estimated degrees of freedom.}", label = "tab:rcsSmry"))
@
}
% latex table generated in R 3.1.0 by xtable 1.7-3 package
% Mon Apr 14 21:19:47 2014
\begin{table}[ht]
\color{red}
\centering
\begin{tabular}{r c d{1} d{3}}
\hline
& edf & F & p-value \\
\hline
Pos & 4 & 27.6 & $<$ 0.001 \\
Lngth & 1 & 1.0 & 0.332 \\
WMC & 1 & 0.2 & 0.629 \\
Pos:Lngth & 9 & 2.9 & 0.001 \\
Pos:WMC & 12 & 4.5 & $<$ 0.001 \\
Lngth:WMC & 4 & 2.8 & 0.016 \\
Pos:Lngth:WMC & 24 & 1.6 & 0.020 \\
\hline
\end{tabular}
\caption{Model for which linearity was \textbf{not} assumed. \textit{Pos} =
\textit{Position}. \textit{Lngth} = \textit{Length}. \textit{WMC} =
\textit{Working Memory Capacity}. \textcolor{red}{\textit{edf} = estimated
degrees of freedom.}}
\label{tab:rcsSmry}
\end{table}
}}
{{% Rcode: plot gamm model
{\footnotesize
<<>>=
load("models/m2_nonlin.rda")
library(nlEEG)
library(LMERConvenienceFunctions)
data(eegLong)
dat<-eegLong
rm(eegLong)
gc(TRUE,TRUE)
# restrict to electrode Fz and 80--180 ms window
dat <- dat[dat$Electrode == "Fz", ]
dat$Electrode<-as.factor(as.character(dat$Electrode))
dat <- dat[dat$Time >= 80 & dat$Time <= 180, ]
rownames(dat)<-1:nrow(dat)
# mean center things
dat$LengthBc <- dat$LengthB - mean(dat$LengthB)
dat$WMCc <- dat$WMC - mean(dat$WMC)
dat$Position <- dat$Position - mean(dat$Position)
#
zlimit<-c(-11.5,11.5)
#
# create plotting info list
plot.info<-list()
pb<-txtProgressBar(min=1,max=length(unique(m2$model$Position)),
char="=",style=3)
for(i in 1:length(unique(m2$model$Position))){
setTxtProgressBar(pb,i)
plot.info[[i]]<-visgam(m2,view=c("LengthBc","WMCc"),
cond=list(Position=i),col="tim",zlim=zlimit,plot=FALSE,se=2)
}
close(pb)
save(plot.info,file="models/plot.info_m2.rda")
#
# create significance mask list
sig.list<-list()
pb<-txtProgressBar(min=1,max=length(unique(m2$model$Position)),
char="=",style=3)
for(i in 1:length(unique(m2$model$Position))){
setTxtProgressBar(pb,i)
m<-plot.info[[i]]
sig.list[[i]]<-sig2d(m,ref=0,plot.type="n")
}
close(pb)
save(sig.list,file="models/sig.list_se2_m2.rda")
load("models/plot.info_m2.rda")
load("models/sig.list_se2_m2.rda")
library(fields)
n=100
data(vars)
#
# mean center things
vars$LengthBc <- vars$LengthB - mean(vars$LengthB)
vars$WMCc <- vars$WMC - mean(vars$WMC)
vars$Position <- vars$Position - mean(vars$Position)
wmcO<-seq(min(vars$WMC),max(vars$WMC),length.out=n)
wmc<-seq(min(vars$WMCc),max(vars$WMCc),length.out=n)
lengthBO<-seq(min(vars$LengthB),max(vars$LengthB),length.out=n)
lengthB<-seq(min(vars$LengthBc),max(vars$LengthBc),length.out=n)
positionO<-1:6
position<-sort(unique(vars$Position))
#
# plot for each position
pb<-txtProgressBar(min=1,max=length(position),char="=",style=3)
pdf("figs/m2nonlin.pdf")
par(mfrow=c(3,2),mar=c(4.1,4.1,3.1,1.1))
zlimit<-c(-12,12)
for(p in 1:length(position)){
setTxtProgressBar(pb,p)
# get plotting info for position p
m<-plot.info[[p]]
# plot
image.plot(m$xm+mean(vars$LengthB),m$ym+mean(vars$WMC),m$mat,
col=tim.colors(200),main=paste("Position",p),xlab=ifelse(p%in%c(1,4,5),
"Length (# letters)",""),ylab=ifelse(p%in%c(1,3,5),
"WMC (from 0.50 to 0.90)",""),zlim=zlimit,
legend.args=list(text=eval(parse(text = paste("expression(",
paste("Amplitude~~(",expression(mu * V),")",sep=""),")",sep=""))),
line=1.5,cex=0.65))
contour(m$xm+mean(vars$LengthB),m$ym+mean(vars$WMC),m$mat,nlevels=15,
add=TRUE)
# add significance mask
sig<-sig.list[[p]]
image(m$xm+mean(vars$LengthB),m$ym+mean(vars$WMC),sig$sigmat,add=T,
col=rgb(255,255,255,180,maxColorValue=255))
}
par(mfrow=c(1,1),mar=c(5.1,4.1,4.1,2.1))
close(pb)
dev.off()
# plot lower 95% confidence interval
pb<-txtProgressBar(min=1,max=length(position),char="=",style=3)
pdf("figs/m2nonlin_lower95.pdf")
par(mfrow=c(3,2),mar=c(4.1,4.1,3.1,1.1))
zlimit<-c(-30,30)
for(p in 1:length(position)){
setTxtProgressBar(pb,p)
# get plotting info for position p
m<-plot.info[[p]]
# plot
image.plot(m$xm+mean(vars$LengthB),m$ym+mean(vars$WMC),m$llm,
col=tim.colors(200),main=paste("Position",p),xlab=ifelse(p%in%c(1,4,5),
"Length (# letters)",""),ylab=ifelse(p%in%c(1,3,5),
"WMC (from 0.50 to 0.90)",""),zlim=zlimit,
legend.args=list(text=eval(parse(text = paste("expression(",
paste("Amplitude~~(",expression(mu * V),")",sep=""),")",sep=""))),
line=1.5,cex=0.65))
contour(m$xm+mean(vars$LengthB),m$ym+mean(vars$WMC),m$llm,nlevels=15,
add=TRUE)
}
par(mfrow=c(1,1),mar=c(5.1,4.1,4.1,2.1))
close(pb)
dev.off()
# plot upper 95% confidence interval
pb<-txtProgressBar(min=1,max=length(position),char="=",style=3)
pdf("figs/m2nonlin_upper95.pdf")
par(mfrow=c(3,2),mar=c(4.1,4.1,3.1,1.1))
zlimit<-c(-30,30)
for(p in 1:length(position)){
setTxtProgressBar(pb,p)
# get plotting info for position p
m<-plot.info[[p]]
# plot
image.plot(m$xm+mean(vars$LengthB),m$ym+mean(vars$WMC),m$ulm,
col=tim.colors(200),main=paste("Position",p),xlab=ifelse(p%in%c(1,4,5),
"Length (# letters)",""),ylab=ifelse(p%in%c(1,3,5),
"WMC (from 0.50 to 0.90)",""),zlim=zlimit,
legend.args=list(text=eval(parse(text = paste("expression(",
paste("Amplitude~~(",expression(mu * V),")",sep=""),")",sep=""))),
line=1.5,cex=0.65))
contour(m$xm+mean(vars$LengthB),m$ym+mean(vars$WMC),m$ulm,nlevels=15,
add=TRUE)
}
par(mfrow=c(1,1),mar=c(5.1,4.1,4.1,2.1))
close(pb)
dev.off()
@
}
}}
{{% Figure: not assuming linearity
\begin{figure}[h]
\centering
\includegraphics[width=0.8\textwidth,height=0.6\textheight]{figs/m2nonlin.pdf}
\caption{Results of the GAMM analysis where linearity was \textbf{not} assumed.
Each panel shows the {\it Length $\times$ WMC} interaction at each position.
The amplitude of the N1a is shown using the same color coding as in Figure
\ref{fig:topomap141ms}. The scale is provided at the bottom of the figure. The
black lines are isovoltage contours; the small numbers give the voltage in
microvolts. Regions shaded in white correspond to portions of the
smooth that included 0 $\mu V$ within the surface's 95\% confidence interval.}
\label{fig:m2nonlin}
\end{figure}
}}
As in the analysis assuming linearity, here we found a significant main effect
of $Position$. This confirmed the pattern observed in both Figures
\ref{fig:FzMain} (discretized variables) and \ref{fig:ContAvg} (continuous
variables) whereby N1a amplitudes were largest for words in the first position
but were more or less consistent across subsequent positions. The lack of
significant main effects for $Length$ and $WMC$ is also consistent with the
linear analysis, suggesting that although the nonlinear curves shown in Figure
\ref{fig:ContAvg} appeared to describe the data more closely than a straight
line would have, these relationships were not statistically reliable.
More striking differences between the linear and nonlinear analyses reside in
the interaction structure of the models. While the linear model
\textcolor{red}{only included a significant $Length \times WMC$ interaction,
the model for which linearity was not assumed contained a significant three-way
interaction $Position \times Length \times WMC$}. Recall that this interaction
was predicted above based on our examination of the plots shown in Figure
\ref{fig:FzInter}. The three-way interaction is depicted in Figure
\ref{fig:m2nonlin}, where the {\it Length $\times$ WMC} interaction is plotted
at each $Position$. \textcolor{red}{Note that regions shaded in white
correspond to portions of a surface where the 95\% confidence interval included
0 $\mu V$. Given the principle of marginality \cite{Venables2000},} we did not
attempt to interpret the two-way interactions or main effects.
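The white shading can be understood as a simple elementwise test on the
prediction surface. A toy sketch follows; we are assuming this mirrors what
the custom {\tt sig2d} helper used in our plotting code computes internally:

```r
# Toy sketch of a significance mask (assumption: this mirrors the custom
# sig2d() helper): a cell of the prediction surface is shaded white when
# its approximate 95% CI, fit +/- 2*se, includes 0.
fit <- matrix(c(-3, -1, 0.5, -2), nrow = 2)   # toy predicted amplitudes
se  <- matrix(c(0.5, 0.8, 0.4, 0.3), nrow = 2)
mask <- (fit - 2 * se <= 0) & (0 <= fit + 2 * se)
mask  # TRUE cells are the ones that would be shaded white
```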
\textcolor{red}{Figure \ref{fig:m2nonlin} shows a complex relationship between
$Position$, $Length$, and $WMC$. It is apparent that the amplitude of the N1a
increased with $Position$, and that the strongest N1a occurred at $Position$ 6,
as evidenced by the dark blue areas in the panel for this position.
Moreover, the areas shaded in white indicate that the reliability of the N1a
decreased as $Position$ increased; that is, the regions where the 95\%
confidence interval included 0 $\mu V$ grew progressively larger: whereas the
whole surface was essentially significant at positions 1 and 2, the N1a at
positions 4, 5, and 6 was increasingly unreliable, especially for longer words
and for participants with a $WMC$ score below 0.65. Starting at position 4, the
amplitude of the N1a elicited by 8--12 letter words was essentially 0 $\mu V$
while the N1a to 3 letter words became increasingly unreliable. At position 6,
only 1--2 and 4--8 letter words exhibited a significant N1a, and only for
participants with a $WMC$ score above 0.65.}
In the behavioural analysis, we had found that $Position$, $Length$, and $WMC$
affected the probability of a four-word sequence being correctly recalled.
\textcolor{red}{More specifically, (i) sequences for which the second word was
longer were less likely to be correctly recalled, (ii) more recently presented
four-word sequences were more likely to be correctly recalled, (iii) this
effect was more pronounced for people with higher $WMC$ scores, and (iv) people
with a lower $WMC$ score correctly recalled four-word sequences only when they
were presented in the last three positions. Our ability to relate the ERP
results back to behaviour, however, is seriously compromised by the fact that
we conflated successfully- and unsuccessfully-recalled sequences. In the next
section, we attempt to characterize the relationship between the probability of
recall of a four-word sequence and N1a amplitudes. We additionally investigate
whether the length of the second word of a sequence, the working memory
capacity of a participant, and the position in which a sequence was presented
modulated this relationship.}
}}
{{\subsection{\textcolor{red}{Linking Behaviour and Scalp-recorded Potentials}}
\textcolor{red}{In order to relate the behavioural and ERP results, we first
computed the probability of recall of each four-word sequence for each
participant from the model obtained in the behavioural analysis. We
subsequently fitted a GAMM that included a four-way interaction $Position
\times Length \times WMC \times$ \textit{Probability of Recall} to N1a
amplitudes. Results are provided in Table \ref{tab:PR} as well as in Figure
\ref{fig:ProbRec}.}
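Note that predictions from the binomial behavioural model are on the logit
scale. A minimal sketch of the back-transformation to probabilities (the
plotting code below uses {\tt boot::inv.logit}, which is equivalent to base
{\tt R}'s {\tt plogis}):

```r
# Toy sketch: mapping logit-scale predictions back to probabilities of
# correct recall with the inverse-logit (logistic) function.
eta <- c(-2, 0, 1.5)   # toy values on the logit (linear-predictor) scale
p   <- plogis(eta)     # 1 / (1 + exp(-eta))
round(p, 3)            # 0.119 0.500 0.818
```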
{{% R code: fitting the models
{\footnotesize
% eval=FALSE
<<>>=
############################################################
# load data #
############################################################
library(nlEEG)
library(LMERConvenienceFunctions)
data(eegLong)
dat<-eegLong
rm(eegLong)
gc(TRUE,TRUE)
# restrict to electrode Fz and 80--180 ms window
dat <- dat[dat$Electrode == "Fz", ]
dat$Electrode<-as.factor(as.character(dat$Electrode))
dat <- dat[dat$Time >= 80 & dat$Time <= 180, ]
rownames(dat)<-1:nrow(dat)
# mean center things
dat$LengthBc <- dat$LengthB - mean(dat$LengthB)
dat$WMCc <- dat$WMC - mean(dat$WMC)
dat$Position <- dat$Position - mean(dat$Position)
### get predicted probability of recall for each item and subject
### and add it to data frame
data(vars)
load("models/m2_behav.rda")
vars$ProbabilityRecall<-predict(m2)
dat<-merge(dat,vars[,c("Subject","Item","ProbabilityRecall")],
by=c("Subject","Item"))
############################################################
# fit model #
############################################################
### GAMM analysis
cl<-makeCluster(8)
m0<-bam(Amplitude~
s(Position,k=6)+
s(LengthBc)+
s(WMCc)+
ti(ProbabilityRecall)+
ti(Position,LengthBc)+
ti(Position,WMCc)+
ti(LengthBc,WMCc)+
ti(Position,ProbabilityRecall)+
ti(LengthBc,ProbabilityRecall)+
ti(WMCc,ProbabilityRecall)+
ti(Position,LengthBc,WMCc)+
ti(Position,LengthBc,ProbabilityRecall)+
ti(Position,WMCc,ProbabilityRecall)+
ti(LengthBc,WMCc,ProbabilityRecall)+
ti(Position,LengthBc,WMCc,ProbabilityRecall)+
s(Subject,bs="re")+
s(Item,bs="re"),
data=dat,cluster=cl,samfrac=0.1,gc.level=2)
### trim data 2.5 STDs below and above the mean of model residuals
dat<-romr.fnc(m0,dat)$data
# n.removed = 170
# percent.removed = 0.318358
m0<-update(m0)
options(width=120)
(smry<-summary(m0))
# Family: gaussian
# Link function: identity
#
# Formula:
# Amplitude ~ s(Position, k = 6) + s(LengthBc) + s(WMCc) + ti(ProbabilityRecall) +
# ti(Position, LengthBc) + ti(Position, WMCc) + ti(LengthBc,
# WMCc) + ti(Position, ProbabilityRecall) + ti(LengthBc, ProbabilityRecall) +
# ti(WMCc, ProbabilityRecall) + ti(Position, LengthBc, WMCc) +
# ti(Position, LengthBc, ProbabilityRecall) + ti(Position,
# WMCc, ProbabilityRecall) + ti(LengthBc, WMCc, ProbabilityRecall) +
# ti(Position, LengthBc, WMCc, ProbabilityRecall) + s(Subject,
# bs = "re") + s(Item, bs = "re")
#
# Parametric coefficients:
# Estimate Std. Error t value Pr(>|t|)
# (Intercept) -2.3309 0.1918 -12.15 <2e-16 ***
# ---
# Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
#
# Approximate significance of smooth terms:
# edf Ref.df F p-value
# s(Position) 3.585 4.158 8.165 1.07e-06 ***
# s(LengthBc) 1.001 1.001 2.700 0.100306
# s(WMCc) 1.001 1.001 0.157 0.692615
# ti(ProbabilityRecall) 3.322 3.615 3.095 0.018915 *
# ti(Position,LengthBc) 5.275 6.746 3.372 0.001651 **
# ti(Position,WMCc) 10.811 12.720 2.914 0.000356 ***
# ti(LengthBc,WMCc) 1.001 1.002 2.318 0.127815
# ti(Position,ProbabilityRecall) 5.354 6.477 2.467 0.019040 *
# ti(LengthBc,ProbabilityRecall) 1.003 1.005 2.164 0.141055
# ti(WMCc,ProbabilityRecall) 1.430 1.705 1.084 0.316961
# ti(Position,LengthBc,WMCc) 11.516 16.318 0.582 0.902808
# ti(Position,LengthBc,ProbabilityRecall) 3.409 3.999 3.324 0.009936 **
# ti(Position,WMCc,ProbabilityRecall) 11.970 16.344 1.191 0.264156
# ti(LengthBc,WMCc,ProbabilityRecall) 5.916 7.138 2.192 0.030932 *
# ti(Position,LengthBc,WMCc,ProbabilityRecall) 39.556 226.000 0.411 1.09e-09 ***
# s(Subject) 7.618 8.000 23.494 < 2e-16 ***
# s(Item) 280.277 424.000 1.965 < 2e-16 ***
# ---
# Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
#
# R-sq.(adj) = 0.0282 Deviance explained = 3.54%
# fREML score = 1.7596e+05 Scale est. = 42.985 n = 53229
options(width=80)
save(smry,file="smry/smry_m0_nonlin_pr.rda")
save(m0,file="models/m0_nonlin_pr.rda")
stopCluster(cl)
### check model assumptions and number of knots
png(filename="figs/m0_nonlin_pr.png")
gam.check(m0)
dev.off()
# Method: fREML Optimizer: perf newton
# full convergence after 16 iterations.
# Gradient range [-0.000254767,0.0003184496]
# (score 175955.3 & scale 42.98458).
# Hessian positive definite, eigenvalue range [1.814298e-05,26607.75].
#
# Basis dimension (k) checking results. Low p-value (k-index<1) may
# indicate that k is too low, especially if edf is close to k'.
#
# k' edf k-index p-value
# s(Position) 5.000 3.585 0.964 0.00
# s(LengthBc) 9.000 1.001 1.005 0.64
# s(WMCc) 9.000 1.001 1.001 0.50
# ti(ProbabilityRecall) 4.000 3.322 0.960 0.00
# ti(Position,LengthBc) 16.000 5.275 1.001 0.51
# ti(Position,WMCc) 16.000 10.811 0.989 0.20
# ti(LengthBc,WMCc) 16.000 1.001 0.998 0.47
# ti(Position,ProbabilityRecall) 16.000 5.354 0.932 0.00
# ti(LengthBc,ProbabilityRecall) 16.000 1.003 0.951 0.00
# ti(WMCc,ProbabilityRecall) 16.000 1.430 0.955 0.00
# ti(Position,LengthBc,WMCc) 64.000 11.516 0.978 0.07
# ti(Position,LengthBc,ProbabilityRecall) 64.000 3.409 0.951 0.00
# ti(Position,WMCc,ProbabilityRecall) 64.000 11.970 0.957 0.00
# ti(LengthBc,WMCc,ProbabilityRecall) 64.000 5.916 0.951 0.00
# ti(Position,LengthBc,WMCc,ProbabilityRecall) 226.000 39.556 0.962 0.00
# s(Subject) 10.000 7.618 NA NA
# s(Item) 426.000 280.277 NA NA
@
}
}}
{{% R code: nonlinear 4-way results table
{\footnotesize
<<>>=
load("smry/smry_m0_nonlin_pr.rda")
x<-smry$s.table
x<-x[,c(1,3,4)]
x[,1]<-round(x[,1],0)
x[,2]<-round(x[,2],1)
x[,3]<-round(x[,3],3)
x<-x[1:15,]
rownames(x)<-c("Pos","Lngth","WMC","PrRec","Pos:Lngth","Pos:WMC","Lngth:WMC","Pos:PrRec","Lngth:PrRec","WMC:PrRec","Pos:Lngth:WMC","Pos:Lngth:PrRec","Pos:WMC:PrRec","Lngth:WMC:PrRec","Pos:Lngth:WMC:PrRec")
x[1,3]<-"< 0.001"
x[6,3]<-"< 0.001"
x[15,3]<-"< 0.001"
print(xtable(x, caption = "Electrode Fz -- 80 to 180 ms. Probability (\\textit{p}) values from model for which linearity was \\textbf{not} assumed. \\textit{Pos} = \\textit{Position}. \\textit{Lngth} = \\textit{Length}. \\textit{WMC} = \\textit{Working Memory Capacity}. \\textit{PrRec} = \\textit{Probability of Correct Recall}. \\textcolor{red}{\\textit{edf} = estimated degrees of freedom.}", label = "tab:PR"))
@
}
}}
{{% Table: non-linear results 4-way
% latex table generated in R 3.1.0 by xtable 1.7-3 package
% Thu Apr 17 23:52:51 2014
\begin{table}[ht]
\color{red}
\centering
\begin{tabular}{r c d{1} d{3}}
\hline
& edf & F & p-value \\
\hline
Pos & 4 & 8.2 & $<$ 0.001 \\
Lngth & 1 & 2.7 & 0.100 \\
WMC & 1 & 0.2 & 0.693 \\
PrRec & 3 & 3.1 & 0.019 \\
Pos:Lngth & 5 & 3.4 & 0.002 \\
Pos:WMC & 11 & 2.9 & $<$ 0.001 \\
Lngth:WMC & 1 & 2.3 & 0.128 \\
Pos:PrRec & 5 & 2.5 & 0.019 \\
Lngth:PrRec & 1 & 2.2 & 0.141 \\
WMC:PrRec & 1 & 1.1 & 0.317 \\
Pos:Lngth:WMC & 12 & 0.6 & 0.903 \\
Pos:Lngth:PrRec & 3 & 3.3 & 0.010 \\
Pos:WMC:PrRec & 12 & 1.2 & 0.264 \\
Lngth:WMC:PrRec & 6 & 2.2 & 0.031 \\
Pos:Lngth:WMC:PrRec & 40 & 0.4 & $<$ 0.001 \\
\hline
\end{tabular}
\caption{\textcolor{red}{Linking behavioural and ERP data. Probability
(\textit{p}) values from model for which linearity was \textbf{not} assumed.
\textit{Pos} = \textit{Position}. \textit{Lngth} = \textit{Length}.
\textit{WMC} = \textit{Working Memory Capacity}. \textit{PrRec} =
\textit{Probability of Correct Recall}. \textit{edf} = estimated degrees of
freedom.}}
\label{tab:PR}
\end{table}
}}
{{% Rcode: plot nonlin pr gamm model
{\footnotesize
<<>>=
# create plotting info list
zlimit<-c(-25,25)
len0<-sort(unique(vars$LengthB))
lenc<-sort(unique(vars$LengthBc))
names(lenc)<-len0
len<-lenc[round(seq(1,length(lenc),length.out=5))]
pr<-sort(unique(m0$model$ProbabilityRecall))
names(pr)<-inv.logit(pr)
pr<-pr[round(seq(1,length(pr),length.out=5))]
plot.info<-list()
pb<-txtProgressBar(min=1,max=length(len)*length(pr),
char="=",style=3)
cnt<-1
for(ii in 1:length(len)){
list.tmp<-list()
for(jj in 1:length(pr)){
setTxtProgressBar(pb,cnt)
list.tmp[[jj]]<-visgam(m0,view=c("WMCc","Position"),
cond=list(LengthBc=len[ii],ProbabilityRecall=pr[jj]),
col="tim",zlim=zlimit,plot=FALSE,se=2)
cnt<-cnt+1
}
plot.info[[ii]]<-list.tmp
}
close(pb)
save(plot.info,file="models/plot.info_m0_pr.rda")
#
# create significance mask list
sig.list<-list()
pb<-txtProgressBar(min=1,max=length(len)*length(pr),
char="=",style=3)
cnt<-1
for(ii in 1:length(len)){
list.tmp<-list()
tmp<-plot.info[[ii]]
for(jj in 1:length(pr)){
setTxtProgressBar(pb,cnt)
m<-tmp[[jj]]
list.tmp[[jj]]<-sig2d(m,ref=0,plot.type="n")
cnt<-cnt+1
}
sig.list[[ii]]<-list.tmp
}
close(pb)
save(sig.list,file="models/sig.list_se2_m0_pr.rda")
#
################################################################
load("models/m0_nonlin_pr.rda")
load("models/plot.info_m0_pr.rda")
load("models/sig.list_se2_m0_pr.rda")
library(nlEEG)
library(fields)
n=100
data(vars)
# mean center things
vars$LengthBc <- vars$LengthB - mean(vars$LengthB)
vars$WMCc <- vars$WMC - mean(vars$WMC)
vars$Position <- vars$Position - mean(vars$Position)
wmcO<-seq(min(vars$WMC),max(vars$WMC),length.out=n)
wmc<-seq(min(vars$WMCc),max(vars$WMCc),length.out=n)
lengthBO<-seq(min(vars$LengthB),max(vars$LengthB),length.out=n)
lengthB<-seq(min(vars$LengthBc),max(vars$LengthBc),length.out=n)
positionO<-1:6
position<-sort(unique(vars$Position))
############################
# plot for each position and each recall probability quantile
library(mgcv)
library(boot)
library(fields)
len0<-sort(unique(vars$LengthB))
lenc<-sort(unique(vars$LengthBc))
names(lenc)<-len0
len<-lenc[round(seq(1,length(lenc),length.out=5))]
pr<-quantile(m0$model$ProbabilityRecall)
pdf("figs/m2nonlin_pr.pdf",width=14,height=14)
pb<-txtProgressBar(min=1,max=length(len)*length(pr),
char="=",style=3)
cnt<-1
zlimit<-c(-25,25)
par(mfrow=c(5,5),mar=c(2.1,3.1,2.1,0.1))
for(jj in 1:length(pr)){
for(ii in 1:length(len)){
setTxtProgressBar(pb,cnt)
# get plotting info
m<-plot.info[[ii]][[jj]]
# plot
if(ii==1 && jj==1){
#par(mar=c(2.1,1.1,2.1,3.1))
image(m$xm+mean(vars$WMC),m$ym+mean(1:6),m$mat,
col=tim.colors(200),main=paste("Lngth = ",
names(len[ii]),"; PrRec = ",round(inv.logit(pr[jj]),2),
sep=""),xlab="",ylab="",zlim=zlimit)
#image.plot(m$xm+mean(vars$WMC),m$ym+mean(1:6),m$mat,
#col=tim.colors(200),main=paste("Lngth = ",names(len[ii]),
#"; PrRec = ",round(inv.logit(pr[jj]),2),sep=""),xlab="",
#ylab="",zlim=zlimit,legend.args=list(text=eval(parse(text=
#paste("expression(", paste("Amplitude~~(",expression(mu * V),
#")",sep=""),")", sep=""))),line=1.5,cex=0.65),legend.width=4)
contour(m$xm+mean(vars$WMC),m$ym+mean(1:6),m$mat,
nlevels=15,add=TRUE)
# add significance mask
sig<-sig.list[[ii]][[jj]]
image(m$xm+mean(vars$WMC),m$ym+mean(1:6),sig$sigmat,add=T,
col=rgb(255,255,255,180,maxColorValue=255))
#par(mar=c(2.1,3.1,2.1,0.1))
}else{
image(m$xm+mean(vars$WMC),m$ym+mean(1:6),m$mat,col=tim.colors(200),
main=paste("Lngth = ",names(len[ii]),"; PrRec = ",
round(inv.logit(pr[jj]),2),sep=""),xlab="",ylab="",zlim=zlimit)
contour(m$xm+mean(vars$WMC),m$ym+mean(1:6),m$mat,
nlevels=15,add=TRUE)
# add significance mask
sig<-sig.list[[ii]][[jj]]
image(m$xm+mean(vars$WMC),m$ym+mean(1:6),sig$sigmat,add=T,
col=rgb(255,255,255,180,maxColorValue=255))
}
cnt<-cnt+1
}
}
par(mfrow=c(1,1),mar=c(5.1,4.1,4.1,2.1))
close(pb)
dev.off()
@
}
}}
{{% Figure 4-way interaction
\begin{figure}[h]
\centering
\hbox{
\hspace{-23.5mm}
\includegraphics[height=0.95\textheight,width=1.25\textwidth]{figs/m2nonlin_pr.pdf}
\caption{Four-way interaction $Position \times Length \times WMC \times
Probability of Correct Recall$. Each panel shows the $WMC \times Position$
model predicted N1a amplitudes for one of five lengths (columns) and
probability of recall values (rows). In each panel, the \textit{x}- and
\textit{y}-axes are \textit{WMC} and \textit{Position}, respectively.
\textit{Lngth} = \textit{Length}. \textit{PrRec} = \textit{Probability of
Recall}. Details are as for Figure \ref{fig:m2nonlin}.}
\label{fig:ProbRec}
}
\end{figure}
}}
\textcolor{red}{This model was much more likely than the previous one (model
with 3-way interaction: {\it df} = 55, $AIC$ = 357228; model with 4-way
interaction: {\it df} = 106, $AIC$ = 351639; difference: {\it df} = 51, $AIC$ =
5589). The four-way interaction was significant ($F(40, 53229) = 0.4$, $p <
0.001$) thus suggesting that the N1a was not only affected by $Position$,
$Length$, and $WMC$, but was also modulated by \textit{Probability of Recall}.
Results are shown in Figure \ref{fig:ProbRec}. Each row corresponds to one of
the quantiles of the \textit{Probability of Recall} variable (0, 0.16, 0.25,
0.39, and 0.92). The columns represent $Length$ where each column corresponds
to one of five $Length$ values (1, 4, 6, 9, and 12). The most noteworthy
feature in Figure \ref{fig:ProbRec} is the increase of N1a amplitudes with the
probability of correctly recalling a four-word sequence. Specifically, N1a
amplitude was approximately 0 $\mu V$ when the probability of recall was 0, but
reached $-15$ $\mu V$ at position 1 for longer words in people with higher
$WMC$ scores when the recall probability was 0.92.}
\textcolor{red}{It is known that the N1a is associated with attention, and
that} attended items typically elicit higher N1a amplitudes \cite[and
references cited therein]{Luck2005, Penolazzi2007}. If the N1a indexes the
amount of attentional resources allocated to encoding a sequence,
\textcolor{red}{then it can be concluded that the successful recall of a
four-word sequence} is associated with the allocation of greater amounts of
attentional resources \textcolor{red}{during the encoding phase. Furthermore,
longer sequences presented in the first few positions required greater amounts
of attention in order to be correctly recalled, and people with higher $WMC$
scores were able to allocate more attentional resources to these items than
people with lower $WMC$ scores.}
}}
}}
}}
}}
{{\section{Discussion}
Relaxing the assumption of linearity enabled us to gain a better understanding
of the data by capturing important structure that was overlooked in the model
for which linearity was assumed. Indeed, only a main effect of
\textcolor{red}{$Length$ and a $Position \times WMC$ interaction were}
uncovered in this model whereas the one for which linearity was not assumed
revealed \textcolor{red}{a significant three-way interaction $Position \times
Length \times WMC$}. Moreover, even if these interactions had reached
significance in the model assuming linearity, the association strength between
N1a amplitudes and these interactions (as indexed by the percentage of deviance
explained\textcolor{red}{, a type of effect size}) would have been much lower
than in the one for which linearity was not assumed, as can be seen in Table
\ref{tab:dev.expl}. \textcolor{red}{In the model for which linearity was not
assumed, the $Position$ main effect and the $Length \times WMC$ interaction
accounted for roughly twice as much deviance as in the one assuming linearity
(i.e., double the effect size). Moreover,} the $Position \times Length \times
WMC$ interaction in the former model accounted for \textcolor{red}{337} times
more deviance than the same interaction in the latter model
\textcolor{red}{(i.e., this effect
was 337 times greater). It is important to note that the extra amount of
deviance explained by the model for which linearity was not assumed is
\emph{not due to data over-fitting}. Indeed, the effects depicted in Figures
\ref{fig:m2nonlin} and \ref{fig:ProbRec} were obtained by penalized regression
and GCV -- a data-driven approach that strikes a balance between under- and
over-smoothing the data. Had the true effects been straight lines, GAMM
would have penalized them down to linearity, as it did in the examples
provided in panels (B) and (D) of Figure \ref{fig:example0}.}
{{% R code: deviance explained
<>=
### linear
library(parallel)  # for makeCluster()/stopCluster()
cl<-makeCluster(8)
load("models/m2_lin.rda")
dat<-m2$model
myterms<-c("Position","LengthBc","WMCc","Position:LengthBc",
"Position:WMCc","LengthBc:WMCc","Position:LengthBc:WMCc")
pb<-txtProgressBar(min=1,max=length(myterms),style=3)
cnt<-1
dev.expl<-vector("numeric")
for(tt in myterms){
setTxtProgressBar(pb,cnt)
eval(parse(text=paste("m<-update(m2,.~.-",tt,")",sep="")))
dev.expl<-c(dev.expl,summary(m2)$dev.expl-summary(m)$dev.expl)
cnt<-cnt+1
}
close(pb)
dev.expl0<-dev.expl
dev.expl1<-round(dev.expl,6)*100
stopCluster(cl)
### nonlinear
cl<-makeCluster(8)
load("models/m2_nonlin.rda")
dat<-m2$model
myterms<-c("s(Position, k = 6)","s(LengthBc)","s(WMCc)","ti(Position,LengthBc)",
"ti(Position,WMCc)","ti(LengthBc,WMCc)","ti(Position,LengthBc,WMCc)")
pb<-txtProgressBar(min=1,max=length(myterms),style=3)
cnt<-1
dev.expl<-vector("numeric")
for(tt in myterms){
setTxtProgressBar(pb,cnt)
eval(parse(text=paste("m<-update(m2,.~.-",tt,")",sep="")))
dev.expl<-c(dev.expl,summary(m2)$dev.expl-summary(m)$dev.expl)
cnt<-cnt+1
}
close(pb)
dev.expl0<-dev.expl
dev.expl2<-round(dev.expl,6)*100
dev.expl.tab<-data.frame(dev.expl1,dev.expl2)
colnames(dev.expl.tab)<-c("Assuming Linearity","Not Assuming Linearity")
dev.expl.tab[2,2]<-0
rownames(dev.expl.tab)<-c("Pos","Lngth","WMC","Pos:Lngth","Pos:WMC",
"Lngth:WMC","Pos:Lngth:WMC")
save(dev.expl.tab,file="smry/dev.expl.tab.rda")
library(xtable)
print(xtable(dev.expl.tab,caption="Percentage of the deviance explained by each term of the model assuming linearity and the one not assuming linearity. The interactions in bold face correspond to ones that were significant in the model for which linearity was not assumed. Overall, the model for which linearity was assumed accounted for 2.75\\% of the variance whereas the one for which it was not assumed accounted for 2.88\\% of the variability in the data.",label="tab:dev.expl",digits=6))
@
}}
{{% Table: amount of deviance explained lin vs nonlin
% latex table generated in R 3.1.0 by xtable 1.7-3 package
% Mon Apr 21 13:08:51 2014
\begin{table}[ht]
\color{red}
\centering
\begin{tabular}{r d{6} d{6}}
\hline
& Assuming Linearity & Not Assuming Linearity \\
\hline
\textbf{Pos} & \textbf{0.115400} & \textbf{0.204800} \\
Lngth & 0.000200 & 0.000000 \\
WMC & 0.000100 & 0.000200 \\
\textbf{Pos:Lngth} & \textbf{0.000200} & \textbf{0.073000} \\
\textbf{Pos:WMC} & \textbf{0.003400} & \textbf{0.136400} \\
\textbf{Lngth:WMC} & \textbf{0.017600} & \textbf{0.035100} \\
\textbf{Pos:Lngth:WMC} & \textbf{0.000400} & \textbf{0.134800} \\
\hline
\end{tabular}
\caption{Percentage of the deviance explained by each
\textcolor{red}{fixed-effect} term of the model assuming linearity and the one
not assuming linearity. \textcolor{red}{The terms in} bold face correspond to
ones that were significant in the model for which linearity was not assumed.
\textcolor{red}{Overall, the model for which linearity was assumed accounted
for 2.75\% of the variance whereas the one for which it was not assumed
accounted for 2.88\% of the variability in the data (including random
effects).}}
\label{tab:dev.expl}
\end{table}
}}
By not assuming linearity, we were also able to push our understanding of the
data even further by allowing the $Length \times WMC$ smooth surfaces at each
$Position$ to vary according to \textcolor{red}{the probability of correct
recall of a four-word sequence}. \textcolor{red}{The results obtained from this
model -- shown in Figure \ref{fig:ProbRec} -- were in close agreement with
the continuous averages illustrated in Figure \ref{fig:ContAvg}. More
specifically, the N1a (1) increased as $Length$ increased, (2) was smaller for
people with intermediate $WMC$ scores (for low recall probabilities), and (3)
was greater in the first few positions.} \textcolor{red}{In the remainder of
the discussion, we tackle a few questions that may have arisen while reading
the paper.}
\paragraph{\textcolor{red}{When Can a Continuous Variable be Discretized}}
\textcolor{red}{As mentioned in the Introduction, discretizing an inherently
continuous variable comes at the cost of decreasing statistical power,
increasing the probability of false positives, and increasing the probability
of false negatives \cite{Cohen:1983, McCallum2002}. If for one reason or
another a researcher is interested in dichotomizing a variable, this should be
done on the basis of an analysis in which continuous variables were used and
linearity was not assumed. A threshold for category membership can then be
derived from the model predicted values.}
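\textcolor{red}{As a minimal sketch of deriving such a threshold (using
simulated data from {\tt mgcv}; the variable names and the choice of cut-point
are purely illustrative):}
{{% Rcode: deriving a threshold from model predictions (sketch)
<>=
library(mgcv)
set.seed(1)
dat<-gamSim(1,n=200,verbose=FALSE)  # simulated data shipped with mgcv
m<-gam(y~s(x2),data=dat)            # linearity not assumed
# predict over the range of the continuous variable
newd<-data.frame(x2=seq(min(dat$x2),max(dat$x2),length.out=100))
fit<-predict(m,newdata=newd)
# e.g., take as threshold the predictor value at which the fitted
# curve crosses its own mean
newd$x2[which.min(abs(fit-mean(fit)))]
@
}}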
\paragraph{\textcolor{red}{Hypothesis Testing and Exploratory Analyses}}
\textcolor{red}{It is often believed that not assuming linearity implies that
one is conducting an exploratory analysis. Although this may be true in some
cases, an analysis in which linearity is not assumed may just as well be based
on {\it a priori} hypotheses. Both types of analyses can be performed with the
same analysis pipeline used here. That is, a hypothesis is supported or
refuted by virtue of the effect of interest being present or absent in the
most likely model.}
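\textcolor{red}{For instance, an {\it a priori} hypothesis about an effect of
$WMC$ could be evaluated by comparing models with and without the
corresponding smooth term (a sketch; the variable and data-frame names are
hypothetical):}
{{% Rcode: confirmatory model comparison (sketch)
<>=
library(mgcv)
m.without<-gam(Amplitude~s(Position,k=6),data=dat)
m.with<-gam(Amplitude~s(Position,k=6)+s(WMCc),data=dat)
# the hypothesis is supported if the model including the effect
# is the more likely one (lower AIC) ...
AIC(m.without)-AIC(m.with)  # a positive difference favours m.with
# ... and the smooth term of interest reaches significance
summary(m.with)
@
}}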
\paragraph{\textcolor{red}{Overfitting}}
\textcolor{red}{Some researchers believe that once the assumption of linearity
is relaxed, the risk of fitting idiosyncratic features of the data is greatly
increased. Given that GAMM uses GCV to strike a balance between over- and
under-fitting the data, such over-fitting is unlikely to arise (see section
Generalized Cross Validation in supplementary material {\it What's Under the
Hood: A Brief Introduction to GAMM} for more details and some examples). An
example of this was shown in Figure \ref{fig:example0} (i.e., GAMM did not
over-fit the data). The analysis pipeline also includes inspecting a smooth's
95\% confidence interval, which further protects against over-fitting.
Portions of the smooth for which 0 (the baseline) is included in the
confidence interval are considered to be flat (however wiggly they may be).}
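\textcolor{red}{This penalization behaviour can be verified on data with a
known, truly linear relationship (a sketch with simulated data):}
{{% Rcode: GCV penalizing a smooth down to a straight line (sketch)
<>=
library(mgcv)
set.seed(1)
x<-runif(200)
y<-2*x+rnorm(200,sd=0.3)  # the true relationship is a straight line
m<-gam(y~s(x))
summary(m)$edf  # edf close to 1: GCV penalized the smooth to (near) linearity
plot(m,se=2)    # the fitted smooth with its confidence interval
@
}}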
\paragraph{\textcolor{red}{Sample Size}}
\textcolor{red}{It is sometimes thought that in order to perform mixed-effects
modelling one needs a sufficient number of observations. One should not forget
that a repeated-measures ANOVA is also a mixed-effects model, which is often
used to model small data sets without any problems. The reason we analyse our
data at the single-trial level is not because we need a large sample size to
perform mixed-effects modelling, but rather to account for sources of bias that
would otherwise be lost during the averaging process and irreversibly affect
the averages. For example, it affords us the possibility to account for unequal
sample sizes between groups, a situation that commonly arises in ERP data (some
trials and/or channels are usually removed). If we were to aggregate the data
as is traditionally done, it would appear to contain an equal number of data
points in each cell, but in fact the averages would be biased (i.e., they would
have been obtained from unequal sample sizes). Another reason concerns the
removal of outlier data points. If the data is un-averaged, data points with
undue influence can be removed based on model residuals, thus making the
results more generalizable. If the data is first averaged, outlier data points
will have irreversibly affected the averages. In our experience, the need to
average the data has only come about for computational tractability.}
\textcolor{red}{Note that an analysis for which linearity is not assumed can
always be used whatever the size of the data set (and may still uncover
significant effects). Some examples of this using real data sets with sample sizes between
56 and 180 are provided in file {\it chunk25-smallSampleSizeExamples.R} (which
can be found in the {\tt .Rnw} file or in folder {\it
nlEEG/vignettes/Rcode\_chunks} of supplementary material {\tt nlEEG}).}
{{% Rcode: small sample size examples
<>=
library(lme4)
library(mgcv)
library(gamair)
###################################################
### example 1
###################################################
data(airquality)
# Daily air quality measurements in New York, May
# to September 1973.
dat<-na.omit(airquality)
dim(dat)
# [1] 111 6
m0<-gam(Ozone~s(Solar.R),data=dat)
plot(m0,se=2)
summary(m0)
dat$Month<-as.factor(dat$Month)
m1<-gam(Ozone~s(Solar.R)+s(Month,bs="re"),data=dat)
AIC(m0);AIC(m1);AIC(m0)-AIC(m1)
# [1] 1070.957
# [1] 1054.093
# [1] 16.86342
summary(m1)
# Family: gaussian
# Link function: identity
#
# Formula:
# Ozone ~ s(Solar.R) + s(Month, bs = "re")
#
# Parametric coefficients:
# Estimate Std. Error t value Pr(>|t|)
# (Intercept) 41.574 5.932 7.009 2.47e-10 ***
# ---
# Signif. codes: 0 `***' 0.001 `**' 0.01 `*' 0.05 `.' 0.1 ` ' 1
#
# Approximate significance of smooth terms:
# edf Ref.df F p-value
# s(Solar.R) 2.745 3.423 8.088 3.45e-05 ***
# s(Month) 3.111 4.000 5.288 5.73e-05 ***
# ---
# Signif. codes: 0 `***' 0.001 `**' 0.01 `*' 0.05 `.' 0.1 ` ' 1
#
# R-sq.(adj) = 0.349 Deviance explained = 38.4%
# GCV score = 768.49 Scale est. = 721.02 n = 111
par(mfrow=c(2,2))
plot(m0,se=2)
plot(m1,se=2)
par(mfrow=c(1,1))
#
###################################################
### example 2
###################################################
data(sleepstudy)
# The average reaction time per day for subjects in a sleep
# deprivation study. On day 0 the subjects had their normal amount
# of sleep. Starting that night they were restricted to 3 hours of
# sleep per night. The observations represent the average reaction
# time on a series of tests given each day to each subject.
dat<-sleepstudy
dim(dat)
# [1] 180 3
m0<-gam(Reaction~s(Days)+s(Subject,bs="re"),data=dat)
summary(m0)
# Family: gaussian
# Link function: identity
#
# Formula:
# Reaction ~ s(Days) + s(Subject, bs = "re")
#
# Parametric coefficients:
# Estimate Std. Error t value Pr(>|t|)
# (Intercept) 298.51 9.05 32.98 <2e-16 ***
# ---
# Signif. codes: 0 `***' 0.001 `**' 0.01 `*' 0.05 `.' 0.1 ` ' 1
#
# Approximate significance of smooth terms:
# edf Ref.df F p-value
# s(Days) 1.00 1 169.40 <2e-16 ***
# s(Subject) 15.89 17 14.35 <2e-16 ***
# ---
# Signif. codes: 0 `***' 0.001 `**' 0.01 `*' 0.05 `.' 0.1 ` ' 1
#
# R-sq.(adj) = 0.697 Deviance explained = 72.6%
# GCV score = 1066.5 Scale est. = 960.46 n = 180
par(mfrow=c(2,2))
plot(m0,se=2)
par(mfrow=c(1,1))
#
###################################################
### example 3
###################################################
data(cbpp)
# Contagious bovine pleuropneumonia (CBPP) is a major disease of
# cattle in Africa, caused by a mycoplasma. This dataset describes
# the serological incidence of CBPP in zebu cattle during a
# follow-up survey implemented in 15 commercial herds located in the
# Boji district of Ethiopia. The goal of the survey was to study
# the within-herd spread of CBPP in newly infected herds. Blood
# samples were quarterly collected from all animals of these herds
# to determine their CBPP status. These data were used to compute
# the serological incidence of CBPP (new cases occurring during a
# given time period). Some data are missing (lost to follow-up).
dat<-cbpp
dim(dat)
# [1] 56 4
m0<-gam(incidence~period+s(size,by=period)+s(herd,bs="re"),data=dat)
m1<-gam(incidence~period+s(size)+s(herd,bs="re"),data=dat)
AIC(m0);AIC(m1);AIC(m0)-AIC(m1)
# [1] 238.1318
# [1] 241.9248
# [1] -3.793057
# no significant interaction with period
summary(m1)
# Family: gaussian
# Link function: identity
#
# Formula:
# incidence ~ period + s(size) + s(herd, bs = "re")
#
# Parametric coefficients:
# Estimate Std. Error t value Pr(>|t|)
# (Intercept) 3.4406 0.5615 6.127 2.42e-07 ***
# period2 -2.2135 0.7222 -3.065 0.00376 **
# period3 -2.2767 0.7231 -3.149 0.00299 **
# period4 -2.3373 0.8183 -2.856 0.00659 **
# ---
# Signif. codes: 0 `***' 0.001 `**' 0.01 `*' 0.05 `.' 0.1 ` ' 1
#
# Approximate significance of smooth terms:
# edf Ref.df F p-value
# s(size) 3.870 4.655 3.956 0.0058 **
# s(herd) 5.308 14.000 0.530 0.1781
# ---
# Signif. codes: 0 `***' 0.001 `**' 0.01 `*' 0.05 `.' 0.1 ` ' 1
#
# R-sq.(adj) = 0.495 Deviance explained = 60.7%
# GCV score = 4.5378 Scale est. = 3.47 n = 56
par(mfrow=c(2,2))
plot(m1,select=1,se=2)
abline(h=0,col=2)
plot(m1,select=2,se=2)
par(mfrow=c(1,1))
@
}}
\paragraph{\textcolor{red}{Model Convergence Problems}}
\textcolor{red}{Model convergence problems are most likely to arise when (1)
the researcher is attempting to fit a model with too many predictor variables
relative to the number of data points in the data, (2) there are too many
redundant predictor variables (high collinearity), and/or (3) the random effect
structure is too complex. In the first case, one should not include in the
model more predictor variables than the sample size divided by 15 (e.g., 990
data points / 15 = 66 variables). In the second case, the researcher should
deal with the collinearity present in their data in some way or another
\cite[]{TremblayTucker2011}. In the third case, the researcher
should fit his or her model with simpler and simpler random effect structures
until the model converges. The most important random effects to include in a
model are crossed by-subject and by-item random intercepts, which in our
experience are always warranted, as well as by-subject adjustments for regions
of interest \cite[]{Newman2012}. Note that, in our experience,
models including these random effects always converge. In some analyses of ERP
data, adding by-item random intercepts, and especially by-item random smooths,
results in models that are computationally intractable. This forces us,
unfortunately, to omit them from our models. With smaller sample sizes,
computational tractability becomes less of a problem and one can usually
include these random effects.}
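\textcolor{red}{Such a back-off sequence might look as follows (a sketch; the
variable names are hypothetical, and each simpler model would only be fitted
if the fuller one failed to converge or proved intractable):}
{{% Rcode: simplifying the random-effect structure (sketch)
<>=
library(mgcv)
# fullest structure: crossed by-subject and by-item random intercepts
# plus by-subject adjustments for regions of interest
m<-gam(Amplitude~s(WMCc)+s(Subject,bs="re")+s(Item,bs="re")+
s(Subject,ROI,bs="re"),data=dat)
# simpler: drop the by-subject ROI adjustments
m<-gam(Amplitude~s(WMCc)+s(Subject,bs="re")+s(Item,bs="re"),data=dat)
# simplest: by-subject random intercepts only
m<-gam(Amplitude~s(WMCc)+s(Subject,bs="re"),data=dat)
@
}}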
}}
{{\section{Conclusion}
In order to deal with the complexity of ERP data, it is very common to assume
that the relationship between ERP amplitudes/latencies and independent
variables is linear. This assumption does not always hold, however. Although in
some ERP studies the assumption of linearity has been relaxed, these studies
have typically employed the common practice of discretizing continuous
variables, which has been shown elsewhere to be suboptimal.
In this paper, we have demonstrated how nonlinearities in ERP data can be
modelled using \textcolor{red}{generalized additive mixed-effects modelling
(GAMM)} in {\tt R}.
This revealed that the assumption of linearity may lead one to draw incorrect
inferences from his/her ERP data. More specifically, we have illustrated how
the assumption of linearity may (i) increase the probability of false
negatives, (ii) if a relationship is obtained, provide a limited or incorrect
representation of the true relationship between variables, and (iii)
under-estimate the strength of the association between variables.
In spite of the clear potential advantage of nonlinear analyses in capturing
the true structure of the data, the greatest limitation of this technique is
\textcolor{red}{probably the added complexity in interpreting the results}. In
the present case, however, we have demonstrated that this complexity does not
render the results uninterpretable; indeed, they provide a much richer
understanding of the data than could be obtained under linear assumptions.
\textcolor{red}{On a closing note, whether or not a researcher predicts that
the relationship between a response variable and an independent variable is
linear, if the independent variable is continuous and has an adequate
distribution we believe that linearity \emph{should never be assumed}. If the
relationship is in fact linear, GAMM will conclude that it is linear (as was
illustrated in Figure \ref{fig:example0}, panel D) and results will be that
much more convincing. If the relationship emerges as non-linear then theory can
be adjusted.}
}}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\bibliography{data}
\end{document}