# Script 2: Extracting the Audio and Decimating the Data

**Python Code to Accompany Douglas, Tremblay, and Newman, "A two for one special: EEG hyperscanning using an existing single-person EEG recording setup"**

---
Copyright (c) 2021 Aaron J Newman & Caitriona L Douglas, NeuroCognitive Imaging Lab, Dalhousie University

This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation; either version 2 of the License, or (at your option) any later version.

This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.

You should have received a copy of the GNU General Public License along with this program; if not, write to the Free Software Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301, USA.

---

# Read in raw data, delete audio channel, and save decimated version

In [1]:
import numpy as np
import pandas as pd
import glob, os
import mne
mne.set_log_level(verbose='error')

## Set Parameters

In [2]:
# "Subject" is the name of your input file, without the extensions
subject = 'conversation_eeg_sample'

# Input should be continuous EEG data 
raw_fname = subject + '.vhdr'

# Filter cutoffs and other parameters
l_freq_use = 0.1
l_freq_ICA = 1.0
h_freq = 20.0

# specify the time window for epoching
tmin = -0.2  # start of each epoch (in sec)
tmax =  1.0  # end of each epoch (in sec)

# ICA parameters
ica_random_state = 42  # seed so ICA is reproducible each time it's run
# Specify n_components as a decimal to set the proportion of explained variance
n_components = .99

baseline = (None, 0)  # means from the first instant to t = 0
reject = dict(eeg=200e-6, eog=200e-6)  # EEG data are in V, so e-6 gives microVolts

# standard montage file to look up channel locations
montage_fname = 'standard_1005'

## Import raw data file

In [3]:
raw = mne.io.read_raw_brainvision(raw_fname,
                                  eog=('P1_Canthi','P2_Canthi'),
                                  misc=('P1_RN','P1_LMass','P1_LN','P1_RMass',
                                        'P2_RN','P2_LMass','P2_LN','P2_RMass',
                                        'Audio'), 
                                  preload=True)

## Export audio track to .wav file
This step writes the audio track that was recorded alongside the EEG data to a separate file. The audio is stored in a WAV file with the same name as the input data file. This WAV file is used to transcribe the conversation and find word onset timings.

In [5]:
# Extract audio track into separate data structure
Audio = raw.get_data(picks=raw.ch_names.index('Audio'))
# re-format audio for export (16 bit int; time x channels rather than channels x time)
audout = np.int16(Audio/np.max(np.abs(Audio)) * 32767).T

audio_fname = subject + '.wav'

# export using scipy's io
from scipy.io import wavfile
# the sampling rate must be an integer; a plain int avoids overflow at rates above 32767 Hz
wavfile.write(audio_fname, int(raw.info['sfreq']), audout)
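
As a quick sanity check on the export, the WAV file can be read back to confirm its sampling rate, data type, and duration. The sketch below is illustrative only: it uses a synthetic 440 Hz tone in place of the real `Audio` channel, and the temporary file name `check.wav` and the variable names are assumptions, not part of the original script.

```python
import os
import tempfile

import numpy as np
from scipy.io import wavfile

# Synthetic stand-in for the 'Audio' channel: 1 s of a 440 Hz tone at 10 kHz,
# shaped (channels x time) like the array returned by raw.get_data()
sfreq = 10000
t = np.arange(sfreq) / sfreq
audio = 0.003 * np.sin(2 * np.pi * 440 * t)[np.newaxis, :]

# Scale to the full 16-bit range and transpose to (time x channels)
audout = np.int16(audio / np.max(np.abs(audio)) * 32767).T

# Write to a temporary file, then read it back
wav_path = os.path.join(tempfile.mkdtemp(), 'check.wav')
wavfile.write(wav_path, int(sfreq), audout)
rate_back, data_back = wavfile.read(wav_path)

# Duration in seconds should match the length of the recording
duration_s = data_back.shape[0] / rate_back
print(rate_back, data_back.dtype, duration_s)
```

With real data, the duration reported here should match the length of the EEG recording.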

## Decimate EEG to 500 Hz
This makes the file much smaller and easier to work with than at the original 10,000 Hz sampling rate.

In [6]:
raw_decim = raw.copy().resample(500, npad='auto')
raw_decim.drop_channels('Audio')

| Field | Value |
|---|---|
| Measurement date | March 03, 2017 12:02:56 GMT |
| Experimenter | Unknown |
| Digitized points | Not available |
| Good channels | 0 magnetometer, 0 gradiometer, and 56 EEG channels |
| Bad channels | |
| EOG channels | P1_Canthi, P2_Canthi |
| ECG channels | Not available |
| Sampling frequency | 500.00 Hz |
| Highpass | 0.00 Hz |
| Lowpass | 250.00 Hz |
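
The 250 Hz lowpass reported above is the Nyquist frequency of the new 500 Hz sampling rate, and going from 10,000 Hz to 500 Hz reduces the number of samples by a factor of 20. The sketch below illustrates that arithmetic on a synthetic signal. Note that it uses SciPy's polyphase `resample_poly` as a stand-in, which is not necessarily the method MNE's `resample` applies internally; the 10 Hz test signal and variable names are assumptions for illustration.

```python
import numpy as np
from scipy.signal import resample_poly

orig_sfreq = 10000   # original sampling rate (Hz)
new_sfreq = 500      # target sampling rate (Hz)

decim_factor = orig_sfreq // new_sfreq   # ratio of old to new sampling rate
nyquist = new_sfreq / 2                  # highest frequency representable after resampling

# Hypothetical test signal: 1 s of a 10 Hz sine at the original rate
t = np.arange(orig_sfreq) / orig_sfreq
x = np.sin(2 * np.pi * 10 * t)

# Polyphase resampling applies an anti-aliasing low-pass filter
# and keeps one output sample per decim_factor input samples
y = resample_poly(x, up=1, down=decim_factor)

print(decim_factor, nyquist, x.size, y.size)
```

Low-pass filtering before discarding samples matters because any activity above the new Nyquist frequency would otherwise alias into the retained frequency range.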


In [7]:
raw_decim.save(subject + '_decim-raw.fif', overwrite=True)