New Book: Applied ERP Data Analysis

I’m excited to announce my new book, Applied ERP Data Analysis. It’s available online FOR FREE on the LibreTexts open source textbook platform. You can cite it as: Luck, S. J. (2022). Applied Event-Related Potential Data Analysis. LibreTexts. https://doi.org/10.18115/D5QG92

The book is designed to be read online, but LibreTexts has a tool for creating a PDF. You can then print the PDF if you prefer to read on paper.

I’ve aimed the book at beginning and intermediate ERP researchers. I assume that you already know the basic concepts behind ERPs, which you can learn from my free online Intro to ERPs course (which takes 3-4 hours to complete).

Whereas my previous book focuses on conceptual issues, the new book focuses on how to implement these concepts with real data. Most of the book consists of exercises in which you process data from the ERP CORE, a set of six ERP paradigms that yield seven different components (P3b, N400, MMN, N2pc, N170, ERN, LRP). Learn by doing!

With real data, you must deal with all kinds of weird problems and make many decisions. The book will teach you principled approaches to solving these problems and making optimal decisions.

Side note: my approach in this book was inspired by Mike X Cohen’s excellent book, Analyzing Neural Time Series Data: Theory and Practice.

You will analyze the data using EEGLAB and ERPLAB, which are free open source Matlab toolboxes. Make sure to download version 9 of ERPLAB. (You may need to buy Matlab, but many institutions provide free or discounted licenses for students.) Although you will learn a lot about these specific software packages, the exercises and accompanying text are designed to teach broader concepts that will translate to any software package (and any ERP paradigm). The logic is much more important than the software!

One key element of the approach, however, is currently ERPLAB-specific. Specifically, the book frequently asks whether a given choice increases or decreases the data quality of the averaged ERPs, as quantified with the Standardized Measurement Error (SME). If this approach makes sense to you, but you prefer a different analysis package, you should encourage the developers of that package to implement SME. All our code is open source, so translating it to a different package should be straightforward. If enough people ask, they will listen!

The book also contains a chapter on scripting, plus tons of example scripts. You don’t have to write scripts for the other chapters. But learning some simple scripting will make you more productive and increase the quality, innovation, and reproducibility of your research.
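To give you a flavor of what a simple script looks like, here is a minimal sketch in the EEGLAB/Matlab style that loops over participants, loads each dataset, re-references it, and saves the result. The filenames and the re-referencing step are placeholders for illustration, not the book's actual pipeline:

    % Minimal participant loop (illustrative only; filenames are placeholders)
    subjects = {'sub01', 'sub02', 'sub03'};
    for s = 1:numel(subjects)
        % Load this participant's dataset from the data folder
        EEG = pop_loadset('filename', [subjects{s} '.set'], 'filepath', './data/');
        % Re-reference to the average of all channels ([] = average reference)
        EEG = pop_reref(EEG, []);
        % Save the processed dataset under a new name
        pop_saveset(EEG, 'filename', [subjects{s} '_reref.set'], 'filepath', './data/');
    end

Once a loop like this works for one processing step, extending it to filtering, epoching, and averaging is mostly a matter of inserting additional function calls.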

I made the book free and open source so that I could give something back to the ERP community, which has given me so much over the years. But I’ve discovered two downsides to making the book free. First, there was no copy editor, so there are probably tons of typos and other errors. Please shoot me an email if you find an error. (But I can’t realistically provide tech support if you have trouble with the software.) Second, there is no marketing budget, so please spread the word to friends, colleagues, students, and billionaire philanthropists.

This book was also designed for use in undergrad and grad courses. The LibreTexts platform makes it easy for you to create a customized version of the book. You can reorder or delete sections or whole chapters. And you can add new sections or edit any of the existing text. It’s published with a CC-BY license, so you can do anything you want with it as long as you provide an attribution to the original source. And if you don’t like some of the recommendations I make in the book, you can just change it to say whatever you like! For example, you can add a chapter titled “Why Steve Luck is wrong about filtering.”

If you are a PI: the combination of the online course, this book, and the resources provided by PURSUE gives you a great way to get new students started in the lab. I’m hoping this makes it easier for faculty to get more undergrads involved in ERP research.
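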

Hints for ICA-based artifact correction

ICA is a great tool for correcting artifacts, especially eye blinks. Here we provide some practical hints for ICA-based artifact correction.

One of the most difficult parts for new users is knowing which independent components (ICs) reflect artifacts and should be removed.  There's a really great overview here, which describes how to use the combination of scalp distribution, time course, and frequency content to distinguish artifact ICs from brain ICs.
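As a rough sketch of that workflow in EEGLAB (the component indices and settings below are illustrative placeholders, not recommendations), you might run ICA, inspect the ICs, and then remove the ones you have identified as artifacts:

    % Run extended Infomax ICA on the continuous, filtered dataset
    EEG = pop_runica(EEG, 'icatype', 'runica', 'extended', 1);
    % Plot scalp maps of the first 20 ICs for visual inspection
    pop_selectcomps(EEG, 1:20);
    % Inspect the scalp map, activity, and spectrum of IC 1 (0 = component mode)
    pop_prop(EEG, 0, 1);
    % Remove the ICs you judged to be blink artifacts (the index here is just an example)
    blink_ics = 1;
    EEG = pop_subcomp(EEG, blink_ics, 0);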

A fundamental assumption of ICA is that the artifact has a perfectly consistent scalp distribution.  This is true for some artifacts, such as blinks and EKG, but it may not be true for others (e.g., EMG, eye movements).  ICA works reasonably well with EMG and eye movements under some conditions (especially if all the eye movements are along a single plane), but we recommend caution with these artifacts.

A key aspect of ICA is that the number of ICs is necessarily equal to the number of channels. You obviously aren't changing the number of underlying brain components by changing the number of electrodes, so the fact that the number of ICs changes as you vary the number of electrodes should make it clear that ICA is an imperfect approach that will not work for every kind of artifact.  Also, no matter how many electrodes you have, the number of brain components is likely to be greater than the number of ICs.  So, ICA will inevitably blend multiple brain components into a single IC, and a single brain component may be split across multiple ICs. As a result, ICA cannot be expected to recover all the true underlying components.  In practice, ICA works best for components that are relatively large and relatively common.  For this reason, our labs mainly use ICA for blinks, which are both very large (in all participants) and fairly common (in most participants). 

Given that the number of ICs is fixed, you don't want to "waste" ICs on huge but infrequent artifacts (especially if you have a relatively small number of channels).  For example, your participants may have periods of "crazy" EEG during breaks (as a result of stretching, movement, etc.), and these periods may eat up a lot of ICs.  You may therefore want to delete sections of "crazy data" before performing ICA.  ERPLAB Toolbox has two routines that are designed to help with this.  One can automatically delete periods of data between trial blocks.  Another can detect and delete periods of crazy data.  But don't use this approach to delete ordinary artifacts -- this should be for periods in which you are seeing huge and irregular voltage deflections.
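Here is a hedged sketch of how those two routines might be called from a script. The function names are ERPLAB's, but the parameter names and sensible values vary across versions and datasets, so check the documentation for your release before copying this:

    % Delete long event-free stretches (e.g., breaks between trial blocks).
    % Parameter names approximate recent ERPLAB versions; verify against your release.
    EEG = pop_erplabDeleteTimeSegments(EEG, ...
        'timeThresholdMS',        5000, ...  % treat event-free gaps longer than 5 s as breaks
        'startEventcodeBufferMS', 2000, ...  % keep a 2-s buffer of data around the
        'endEventcodeBufferMS',   2000);     %   event codes that bound each gap

    % Detect (and reject) segments of extreme "crazy" voltages before running ICA
    EEG = pop_continuousartdet(EEG, ...
        'ampth',  500, ...   % peak-to-peak amplitude threshold in microvolts (adjust to your data)
        'winms', 1000, ...   % 1000-ms moving window
        'stepms', 500);      % 500-ms step between windows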

Similarly, it's important to eliminate slow drifts prior to ICA.  We generally recommend a high-pass filter with a half-amplitude cutoff of 0.1 Hz and a slope of 12 dB/octave.  You can also apply a low-pass filter if your data have a lot of high-frequency noise.  This filtering should be done on the continuous data (prior to epoching and prior to deleting segments of crazy data) and should be consistent for all participants (e.g., don't low-pass filter for some participants but not others, or use different cutoffs for different participants). In theory, you can calculate the ICA weights on heavily filtered data (e.g., high-pass cutoff at 1 Hz) and then apply the weights to less filtered data, but this is not guaranteed to work well.
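In ERPLAB, that recommendation corresponds to a second-order Butterworth high-pass filter (a second-order filter rolls off at 12 dB/octave). Here is a sketch of the call; the parameter keys are from pop_basicfilter but may differ slightly across versions, so check its help text:

    % High-pass filter the continuous data on all channels.
    % Order 2 Butterworth = 12 dB/octave; 0.1 Hz half-amplitude cutoff.
    EEG = pop_basicfilter(EEG, 1:EEG.nbchan, ...
        'Filter',   'highpass', ...
        'Design',   'butter',   ...
        'Cutoff',   0.1,        ...
        'Order',    2,          ...
        'RemoveDC', 'on',       ...
        'Boundary', 'boundary');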

Another important bit of practical advice is that ICA involves training a neural network, and you need enough "trials" (time points) to train the network. A general heuristic is that the # of time points must be greater than 20 x (# channels)^2.  The key is that the number of channels is squared.  So, with 64 channels, you would need 81,920 points (which would be about 5.5 minutes of data with a 250 Hz sampling rate).  However, with 128 channels, you would need 4 times as many points, and with 256 channels, you would need 16 times as many points.
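If you want to check this heuristic for your own montage and sampling rate, the arithmetic is easy to script (the values below are just examples):

    % Rough sample-size check for ICA: points needed > 20 * (number of channels)^2
    nchan = 64;                      % number of channels (example)
    srate = 250;                     % sampling rate in Hz (example)
    points_needed  = 20 * nchan^2;                 % 81,920 points for 64 channels
    minutes_needed = points_needed / srate / 60;   % about 5.5 minutes at 250 Hz
    fprintf('Need more than %d points (~%.1f minutes at %d Hz)\n', ...
        points_needed, minutes_needed, srate);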

For additional, very specific advice, see Makoto's preprocessing pipeline.