Final decisions about preprocessing

New smart tricks:

  1. For high-density EEG, Euclidean ASR will be used instead of Riemannian ASR (rASR): it is about 144 times faster to compute, and in our saccadic artefact reduction assessment, Euclidean ASR with the 'scf' estimator was found to be highly effective at removing artefacts. The reason is that, unlike rASR, Euclidean ASR can skip the regularization requirement (rASR obliges you to use the 'lwf' estimator, which is a downside, although it still does a pretty amazing job).
  2. Run the script on one file at a time, selected by a command-line flag (see the sketch after this list). This works around the RAM allocation problem that made the kernel die: each run preprocesses only a single file, and the runs for all files are executed sequentially, each with a fresh RAM allocation.
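
A minimal sketch of how the per-file flag works (the data directory, file pattern, and the final preprocessing call are hypothetical placeholders, not the actual script internals):

    import sys
    from glob import glob

    # Hypothetical file layout: one recording per file.
    files = sorted(glob("data/raw/*.fif"))

    # The single command-line argument is the index of the file to preprocess,
    # e.g. `python very_new_preprocessing_applywholedata_v1.2_HPC.py 3` handles
    # only files[3] in its own Python process, so RAM is released when it exits.
    idx = int(sys.argv[1])
    print(f"Preprocessing file {idx}: {files[idx]}")
    # ... run the full preprocessing pipeline on files[idx] only ...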

===== BCI =====

python very_new_preprocessing_applywholedata_v1.2_HPC.py 0; python very_new_preprocessing_applywholedata_v1.2_HPC.py 1; python very_new_preprocessing_applywholedata_v1.2_HPC.py 2; python very_new_preprocessing_applywholedata_v1.2_HPC.py 3; python very_new_preprocessing_applywholedata_v1.2_HPC.py 4; python very_new_preprocessing_applywholedata_v1.2_HPC.py 5

===== BCI =====

===== LD Cueing =====

python very_new_preprocessing_applywholedata_v1.2_HPC.py 0; python very_new_preprocessing_applywholedata_v1.2_HPC.py 1

===== LD Cueing =====

===== Lucireta =====

python very_new_preprocessing_applywholedata_v1.2_HPC.py 0; python very_new_preprocessing_applywholedata_v1.2_HPC.py 1; python very_new_preprocessing_applywholedata_v1.2_HPC.py 2; python very_new_preprocessing_applywholedata_v1.2_HPC.py 3; python very_new_preprocessing_applywholedata_v1.2_HPC.py 4; python very_new_preprocessing_applywholedata_v1.2_HPC.py 5; python very_new_preprocessing_applywholedata_v1.2_HPC.py 6; python very_new_preprocessing_applywholedata_v1.2_HPC.py 7

===== Lucireta =====


Data selection:

  1. Select whole nights & naps without any cropping. Why? Because: 1.1) discontinuities in the signal are unacceptable, and concatenating cropped segments generates undefined artefacts at the joins; 1.2) most of the lucid parts are extremely short, and in the majority of subjects the last REM period containing lucidity is also extremely short. So we can't extend the cropped data for most lucid dreaming sessions, which in turn constrains the other states, and in the end we would not have enough stationary data per state.

Preprocessing pipeline:

  1. Channel cleaning: assign channel types and rename channels so they match the 10/20 and/or 10/05 EEG layouts, and completely remove the EOG, EMG, ECG, and miscellaneous channels (see the MNE sketch below).
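
A minimal MNE sketch of this step (the file path, channel names, and the rename mapping are assumptions; the real mappings depend on each dataset):

    import mne

    raw = mne.io.read_raw_fif("example_raw.fif", preload=True)  # placeholder path

    # Assign correct channel types so non-EEG channels can be identified (example names).
    raw.set_channel_types({"EOG1": "eog", "EMG1": "emg", "ECG1": "ecg"})

    # Rename channels to match 10/20 / 10/05 labels (example mapping).
    raw.rename_channels({"EEG F3-A2": "F3", "EEG C3-A2": "C3"})

    # Drop EOG, EMG, ECG, and miscellaneous channels entirely.
    raw.pick(picks="eeg")

    # Attach a 10/05 montage for the downstream steps.
    raw.set_montage("standard_1005", on_missing="ignore")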

  2. pyPREP (this module comes first because it removes line noise and its harmonics and detects bad channels & segments so they can be interpolated, which is a prerequisite for advanced artefact removal algorithms such as ICA, ASR, rASR, and SSP. pyPREP also applies only a temporary high-pass filter and no low-pass filter, so it yields unfiltered but cleaned EEG). Parameters (a usage sketch follows the list):

  • line-noise frequency: either 50 or 60 Hz
  • RANSAC = False for the 6-channel datasets, True for datasets with more than 19 channels
  • montage = 10/05
  • random_state = 31
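
A sketch of this step, assuming pyPREP's `PrepPipeline` interface (the 50 Hz line frequency and the placeholder file path are examples; adjust per dataset):

    import numpy as np
    import mne
    from pyprep import PrepPipeline

    raw = mne.io.read_raw_fif("example_raw.fif", preload=True)  # placeholder path
    sfreq = raw.info["sfreq"]

    prep_params = {
        "ref_chs": "eeg",
        "reref_chs": "eeg",
        # 50 Hz line noise and its harmonics up to Nyquist (use 60 for 60 Hz mains).
        "line_freqs": np.arange(50, sfreq / 2, 50),
    }

    prep = PrepPipeline(
        raw,
        prep_params,
        montage=mne.channels.make_standard_montage("standard_1005"),
        ransac=True,          # False for the 6-channel datasets
        random_state=31,
    )
    prep.fit()
    raw_clean = prep.raw      # cleaned, re-referenced Raw with bad channels interpolated
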
  3. HP filter (for cognitive analyses, use 1 Hz; method = IIR, Butterworth)
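
A minimal MNE sketch of this filter, applied to the pyPREP output from the previous sketch (the 4th-order Butterworth is an assumed example; the order is not specified above):

    # 1 Hz high-pass, IIR Butterworth.
    raw_clean.filter(l_freq=1.0, h_freq=None, method="iir",
                     iir_params=dict(order=4, ftype="butter"))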

  4. Calibration time interval selection (10 minutes of data, found by sliding a 10-minute window with a 2.5-minute step over the recording and selecting the window with the lowest standard deviation; see the sketch below).
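
A sketch of the window search described in this step (variable names are illustrative; `raw_clean` is the filtered output from the previous sketches):

    import numpy as np

    data = raw_clean.get_data()                 # (n_channels, n_samples)
    sfreq = raw_clean.info["sfreq"]
    win = int(10 * 60 * sfreq)                  # 10-minute window
    step = int(2.5 * 60 * sfreq)                # 2.5-minute step

    starts = list(range(0, data.shape[1] - win + 1, step))
    # Standard deviation of each candidate window (over all channels and samples).
    stds = [np.std(data[:, s:s + win]) for s in starts]
    best = starts[int(np.argmin(stds))]

    calib_data = data[:, best:best + win]       # calibration segment for ASR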

  5. rASR (we use rASR, a Riemannian variant of ASR, because it makes local changes without violating locality. Since sleep data are non-stationary, we can't use an algorithm that assumes the mental state is stationary and similar across the whole time series. Moreover, sleep stages change approximately every 15 minutes, and the sessions containing lucid dreaming have even shorter segments):

  • 1-second window length without padding (the approximate duration of sleep artefacts drives this choice of window size)
  • Window lengths longer than 1 s raise an error, so the window size is set to 1 second (win_len=1)
  • method = 'riemann'
  • estimator = 'lwf'
  • win_overlap = 0.66
  • cutoff = 2.5 for high-density data (more than 19 channels), 5 for low-density data (6 channels). The higher the cutoff, the less aggressive the cleaning; if you are not careful, lower cutoff values delete artefact segments so aggressively that the stationarity of the given state is harmed. A usage sketch follows the list.
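
A sketch of the ASR call, assuming the `meegkit` ASR implementation, which exposes the parameters listed above; the actual package and call signature used in the scripts may differ (`calib_data` and `data` come from the calibration sketch above):

    from meegkit.asr import ASR

    asr = ASR(sfreq=sfreq, cutoff=2.5,     # 2.5 for high-density, 5 for 6-channel data
              method="riemann",            # or "euclid" with estimator="scf" for
              estimator="lwf",             # high-density data, per the note at the top
              win_len=1, win_overlap=0.66)

    # Calibrate on the low-variance segment selected above, then clean the whole recording.
    asr.fit(calib_data)
    data_clean = asr.transform(data)
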
  6. Robust z-score normalization of each channel individually (this makes it possible to combine data recorded with different devices!); see the robust z-score function at the bottom of this page.

  7. Downsampling to 100 Hz (common sampling rate across datasets)

  8. Channel selection

  • Select 6 channels (F3, F4, C3, C4, O1, O2) for a bigger sample size
  • Select 64 channels for source estimation
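
A minimal MNE sketch of steps 7 and 8 (it assumes the cleaned data have been put back into an MNE Raw object called `raw_clean`; channel names follow the list above):

    # Common 100 Hz sampling rate across datasets.
    raw_clean.resample(100)

    # 6-channel subset for the larger combined sample; keep a full 64-channel
    # copy separately for source estimation.
    raw_6ch = raw_clean.copy().pick(["F3", "F4", "C3", "C4", "O1", "O2"])
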
  9. Final check in the 36-45 Hz band to assess the contrast with the other frequency bands in terms of microsaccade detection, for each cleaned recording (32 in total).
  • Use an MNE spectrogram to check the strip around 40 Hz.
  • Check early REM, later REM, and Lucid.
  • Don't worry about the Wake data (in principle, Wake is already heavily contaminated, and we already know that it differs extremely significantly from the other states).
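
The check itself uses an MNE spectrogram; as a minimal stand-in, a PSD restricted to the band of interest can be inspected like this (assuming MNE's `compute_psd` API, version >= 1.2):

    # Band of interest for microsaccade-related activity around 40 Hz.
    raw_clean.compute_psd(fmin=36, fmax=45).plot()

    # Wider view to compare the 36-45 Hz strip against the other bands.
    raw_clean.compute_psd(fmax=50).plot()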

Robust z-score normalization algorithm:

    import numpy as np

    def robustZScore(raw):
        """Robust z-score each channel of an MNE Raw object (median/MAD based)."""
        raw_robustzscore = raw.copy()
        for i in range(len(raw._data)):
            med = np.median(raw._data[i])
            MAD = np.median(np.abs(raw._data[i] - med))
            # 0.6745 is the 0.75 quantile of the standard normal distribution, so the
            # robust z-scores are comparable to ordinary z-scores for Gaussian data.
            raw_robustzscore._data[i] = 0.6745 * (raw._data[i] - med) / MAD
        return raw_robustzscore

How to use:

Both the input and the output are MNE Raw structures.

raw = robustZScore(raw)