FreeSurfer v5.3 QA on CFN - PennBBL/conte GitHub Wiki
When the CONTE data were moved to CFN, the subjects were reprocessed there using FreeSurfer v5.3, the same pipeline and version that was run on monstrum.
- /data/joy/BBL/projects/conteReproc2017/freesurfer/QA.sh, which also calls:
  - /data/joy/BBL/projects/conteReproc2017/freesurfer/aparc.stats.meanthickness.totalarea.sh
  - /data/joy/BBL/projects/conteReproc2017/freesurfer/cnr_euler_number_calculation.sh
  - /data/joy/BBL/projects/conteReproc2017/freesurfer/flag_outliers.R
  - /data/joy/BBL/projects/conteReproc2017/freesurfer/cnr_euler_qa.R
  - /data/joy/BBL/projects/conteReproc2017/freesurfer/sum_flags_auto_qa.R
Runs the quality assurance pipeline for FreeSurfer 5.3 on CFN and outputs several csv's with subject-specific QA data. The code is broken down into the following main sections:
- Create subcortical volume segmentation csv - aseg stats
- Create mean QA data charts (thickness and surface area charts)
- Create parcellation csv's - aparc stats
- Create CNR and Euler number csv's
- Flag subjects based on whether they are an outlier (>2 SD) on at least one of the following measures (see the sketch after this list):
  - Mean thickness
  - Total surface area
  - Cortical volume
  - Subcortical gray matter
  - Cortical white matter
  - CNR
  - Euler number
  - SNR
  - ROI: raw cortical thickness
  - ROI: laterality thickness difference
- Flag (gray/csf flag, gray/white flag, euler number flag, number outliers rois thickness flag, total outliers)
- Create SNR csv
- Create summary of automatic flags csv
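With the exception of the CNR and Euler number flags, which only flag unusually low values (mean - 2 SD), all of these flags use the same mean ± 2 SD rule. Below is a minimal sketch of that rule; `flag_outlier` is a hypothetical helper for illustration, and the actual per-measure logic lives in flag_outliers.R and cnr_euler_qa.R, documented later on this page.

```r
# Hypothetical helper illustrating the mean +/- 2 SD outlier rule; the actual
# per-measure logic lives in flag_outliers.R and cnr_euler_qa.R (documented below).
flag_outlier <- function(x, sdthresh = 2) {
  upper <- mean(x, na.rm = TRUE) + sdthresh * sd(x, na.rm = TRUE)
  lower <- mean(x, na.rm = TRUE) - sdthresh * sd(x, na.rm = TRUE)
  as.integer(x > upper | x < lower)  # 1 = outlier, 0 = within the threshold
}

# Example (assuming a data frame 'full' with a meanthickness column, as in flag_outliers.R):
# full$meanthickness_outlier <- flag_outlier(full$meanthickness)
```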
The csv that contains the final QA is:
- /data/joy/BBL/studies/conte/subjectData/freesurfer/stats/auto.qa.summary.flags.csv
The final include/exclude decision is based on a sum of the binary flags for the following measures:
- meanthickness_outlier
- totalarea_outlier
- SubCortGrayVol_outlier
- CortexVol_outlier
- CorticalWhiteMatterVol_outlier
- snr_outlier
- noutliers.thickness.rois_outlier
- noutliers.lat.thickness.rois_outlier
- Gray/CSF outlier
- Gray/White outlier
- Euler number outlier
- failed_freesurfer
If there is a 1 in any of the aforementioned flag columns for a subject, the subject should be flagged and excluded from FreeSurfer analyses; this is noted in the "fsFlag" column of the final QA output file.
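As an illustration of how the final file might be consumed downstream, here is a minimal sketch that is not part of the pipeline itself. It assumes the documented output path and the bblid, scanid, and fsFlag columns; the `include`/`exclude` names are purely illustrative.

```r
# Minimal downstream-usage sketch (assumed, not part of the pipeline): read the final
# summary csv and split subjects on the fsFlag column documented above.
qa <- read.csv("/data/joy/BBL/studies/conte/subjectData/freesurfer/stats/auto.qa.summary.flags.csv")

# fsFlag == 1 means the subject hit at least one automatic QA flag and should be excluded
include <- qa[which(qa$fsFlag == 0), c("bblid", "scanid")]
exclude <- qa[which(qa$fsFlag == 1), c("bblid", "scanid")]
cat("including", nrow(include), "subjects; excluding", nrow(exclude), "subjects\n")
```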
All output from the QA pipeline:
- Aseg stats
- /data/joy/BBL/studies/conte/subjectData/freesurfer/stats/aseg.stats/aseg.stats.volume.csv
- Mean thickness and total area
- /data/joy/BBL/studies/conte/subjectData/freesurfer/stats/aparc.stats/bilateral.meanthickness.totalarea.csv
- Left hemisphere thickness
- /data/joy/BBL/studies/conte/subjectData/freesurfer/stats/aparc.stats/lh.aparc.stats.thickness.csv
- Right hemisphere thickness
- /data/joy/BBL/studies/conte/subjectData/freesurfer/stats/aparc.stats/rh.aparc.stats.thickness.csv
- Left hemisphere volume
- /data/joy/BBL/studies/conte/subjectData/freesurfer/stats/aparc.stats/lh.aparc.stats.volume.csv
- Right hemisphere volume
- /data/joy/BBL/studies/conte/subjectData/freesurfer/stats/aparc.stats/rh.aparc.stats.volume.csv
- Left hemisphere area
- /data/joy/BBL/studies/conte/subjectData/freesurfer/stats/aparc.stats/lh.aparc.stats.area.csv
- Right hemisphere area
- /data/joy/BBL/studies/conte/subjectData/freesurfer/stats/aparc.stats/rh.aparc.stats.area.csv
- Euler number data
- /data/joy/BBL/studies/conte/subjectData/freesurfer/stats/cnr/euler_number.csv
- CNR data
- /data/joy/BBL/studies/conte/subjectData/freesurfer/stats/cnr/cnr_buckner.csv
- CNR csv (subject-specific)
- /data/joy/BBL/studies/conte/processedData/freesurfer/[bblid]/[datexscanid]/[bblid]_[datexscanid]_cnr.txt
- Left hemisphere Euler number csv (subject-specific)
- /data/joy/BBL/studies/conte/processedData/freesurfer/[bblid]/[datexscanid]/[bblid]_[datexscanid]_lh_euler.txt
- Right hemisphere Euler number csv (subject-specific)
- /data/joy/BBL/studies/conte/processedData/freesurfer/[bblid]/[datexscanid]/[bblid]_[datexscanid]_rh_euler.txt
- Overall data (mean thickness, cnr, snr, total area, cortical and subcortical volume, and outliers)
- /data/joy/BBL/studies/conte/subjectData/freesurfer/stats/all.flags.n[subjnum].csv
- CNR and euler number flags csv
- /data/joy/BBL/studies/conte/subjectData/freesurfer/stats/cnr_euler_flags_n[subjnum].csv
- SNR csv
- /data/joy/BBL/studies/conte/subjectData/freesurfer/stats/cnr/snr.txt
- Summary csv of automatic QA flags
- /data/joy/BBL/studies/conte/subjectData/freesurfer/stats/auto.qa.summary.flags.csv
Wrapper Script: /data/joy/BBL/projects/conteReproc2017/freesurfer/QA.sh
Requires:
- Subject list in the form of a text file (slist); a list of bblid/datexscanid entries
  - Example: bblid/datexscanid
- Full path to the directory where the output csv's should be written

```bash
export output_dir=/data/joy/BBL/studies/conte/subjectData/freesurfer/stats
```

- FreeSurfer-specific environment variables and subject-specific variables to be set

```bash
export SUBJECTS_DIR=/data/joy/BBL/studies/conte/processedData/freesurfer
export QA_TOOLS=/data/joy/BBL/applications/QAtools_v1.1/
export FREESURFER_HOME=/share/apps/freesurfer/5.3.0/
export PATH=$FREESURFER_HOME/bin/:$PATH
```
Script: /data/joy/BBL/projects/conteReproc2017/freesurfer/QA.sh
Steps:
- Checks whether the aseg.stats output directory exists, and if it doesn't, creates it (/data/joy/BBL/studies/conte/subjectData/freesurfer/stats/aseg.stats)

```bash
if [ ! -e "$output_dir/aseg.stats" ]; then
    mkdir -p $output_dir/aseg.stats
fi
```

- It then runs asegstats2table, which takes each subject's subcortical stats file created by recon-all and outputs the data into a table in which each row is a subject and each column is a segmentation volume (in mm3). If a subject is missing their aseg file, that subject is skipped (via the --skip argument to asegstats2table).

```bash
asegstats2table --subjectsfile=$slist -t $output_dir/aseg.stats/aseg.stats.volume.csv -m volume --skip
```

- The output table is /data/joy/BBL/studies/conte/subjectData/freesurfer/stats/aseg.stats/aseg.stats.volume.csv
Script: /data/joy/BBL/projects/conteReproc2017/freesurfer/aparc.stats.meanthickness.totalarea.sh
Requires:
- Arguments passed by /data/joy/BBL/projects/conteReproc2017/freesurfer/QA.sh
- subject list (slist)
- FreeSurfer subjects directory containing the subject-specific (bblid/datexscanid) output (SUBJECTS_DIR)
- full path of directory where csv should be output (OUTPUT_DIR)
Steps:
- Sets variables based on those passed by the wrapper script

```bash
export OUTPUT_DIR=$2
export SUBJECTS_DIR=$3
slist=$1
```

- Checks whether the aparc.stats output directory exists, and if it doesn't, creates it (/data/joy/BBL/studies/conte/subjectData/freesurfer/stats/aparc.stats)

```bash
if [ ! -e $OUTPUT_DIR/aparc.stats ]; then
    mkdir $OUTPUT_DIR/aparc.stats
fi
```

- Creates and writes headers to the aggregate file /data/joy/BBL/studies/conte/subjectData/freesurfer/stats/aparc.stats/bilateral.meanthickness.totalarea.csv

```bash
echo bblid,scanid,rh.meanthickness,rh.totalarea,lh.meanthickness,lh.totalarea > $OUTPUT_DIR/aparc.stats/bilateral.meanthickness.totalarea.csv
```

- Loops through subjects and pulls bblid and scanid from slist

```bash
for i in $(cat $slist); do
    bblid=$(echo $i | cut -d"/" -f1)
    scanid=$(echo $i | cut -d"/" -f2)
```

- Checks for the subject's right and left aparc.stats files
- If those files don't exist, it prints "no rh/lh.aparc.stats file for bblid/datexscanid"
- If the files do exist, it greps for "MeanThickness" and "SurfArea" in each subject's aparc.stats file created by recon-all

The code for steps 5-7:

```bash
### RH MEAN THICKNESS AND TOTAL AREA ###
########################################
subdir=$SUBJECTS_DIR/$i
if [ ! -e "$subdir/stats/rh.aparc.stats" ]; then
    echo "no rh.aparc.stats file for" $i
    rmt="NA"
    rta="NA"
else
    string=`grep MeanThickness, $subdir/stats/rh.aparc.stats`
    rmt=`echo $string | cut -d "," -f 4`
    string=`grep SurfArea, $subdir/stats/rh.aparc.stats`
    rta=`echo $string | cut -d "," -f 4`
fi

### LH MEAN THICKNESS AND TOTAL AREA ###
########################################
if [ ! -e "$subdir/stats/lh.aparc.stats" ]; then
    echo "no lh.aparc.stats file for" $i
    lmt="NA"
    lta="NA"
else
    string=`grep MeanThickness, $subdir/stats/lh.aparc.stats`
    lmt=`echo $string | cut -d "," -f 4`
    string=`grep SurfArea, $subdir/stats/lh.aparc.stats`
    lta=`echo $string | cut -d "," -f 4`
fi
```

- It appends this data to the table /data/joy/BBL/studies/conte/subjectData/freesurfer/stats/aparc.stats/bilateral.meanthickness.totalarea.csv

```bash
echo $bblid,$scanid,$rmt,$rta,$lmt,$lta >> $OUTPUT_DIR/aparc.stats/bilateral.meanthickness.totalarea.csv
```
Script: /data/joy/BBL/projects/conteReproc2017/freesurfer/QA.sh
Steps:
- Runs aparcstats2table for each hemisphere/measure combination (left and right hemispheres for volume, thickness, and area). Subjects without aparc.stats files are skipped via the --skip flag
- Outputs this parcellation data into 6 tables in /data/joy/BBL/studies/conte/subjectData/freesurfer/stats/aparc.stats/ with the naming convention [hemi lh/rh].aparc.stats.[measurement volume/thickness/area].csv

The code for steps 1-2:

```bash
aparcstats2table --hemi lh --subjectsfile=$slist -t $output_dir/aparc.stats/lh.aparc.stats.thickness.csv -m thickness --skip
aparcstats2table --hemi rh --subjectsfile=$slist -t $output_dir/aparc.stats/rh.aparc.stats.thickness.csv -m thickness --skip
aparcstats2table --hemi lh --subjectsfile=$slist -t $output_dir/aparc.stats/lh.aparc.stats.volume.csv -m volume --skip
aparcstats2table --hemi rh --subjectsfile=$slist -t $output_dir/aparc.stats/rh.aparc.stats.volume.csv -m volume --skip
aparcstats2table --hemi lh --subjectsfile=$slist -t $output_dir/aparc.stats/lh.aparc.stats.area.csv --skip
aparcstats2table --hemi rh --subjectsfile=$slist -t $output_dir/aparc.stats/rh.aparc.stats.area.csv --skip
```
Script: /data/joy/BBL/projects/conteReproc2017/freesurfer/cnr_euler_number_calculation.sh
Requires:
- Arguments passed by /data/joy/BBL/projects/conteReproc2017/freesurfer/QA.sh
- subject list (slist) (/data/joy/BBL/studies/conte/subjectData/design3FullSubjectList.txt)
- FreeSurfer subjects directory containing the subject-specific (bblid/datexscanid) output (SUBJECTS_DIR)
- full path to the directory where files should be output (OUTPUT_DIR)
- FreeSurfer-specific environment variables to be set (FREESURFER_HOME and QA_TOOLS)
Steps:
- Sets variables based on those passed by the wrapper script

```bash
#set freesurfer specific variables
export SUBJECTS_DIR=$2
export OUTPUT_DIR=$3
export QA_TOOLS=/data/joy/BBL/applications/QAtools_v1.1/
export FREESURFER_HOME=/share/apps/freesurfer/5.3.0/
export PATH=$FREESURFER_HOME/bin/:$PATH

#set subject list, directory to output aggregated files to, and the filenames of those aggregate files
slist=$1
outdir=$OUTPUT_DIR/cnr
file=$outdir/cnr_buckner.csv
euler_file=$outdir/euler_number.csv
```

- Creates and writes headers to the aggregate files /data/joy/BBL/studies/conte/subjectData/freesurfer/stats/cnr/cnr_buckner.csv and /data/joy/BBL/studies/conte/subjectData/freesurfer/stats/cnr/euler_number.csv

```bash
echo "bblid,scanid,total,graycsflh,graycsfrh,graywhitelh,graywhiterh" > $file
echo "bblid,scanid,left_euler,right_euler" > $euler_file
```

- Loops through subjects, pulls bblid and scanid from slist, and creates variables for the surf and mri directories

```bash
for i in $(cat $slist); do
    bblid=$(echo $i | cut -d"/" -f1)
    scanid=$(echo $i | cut -d"/" -f2)
    echo "working on subject" $i
    surf=`ls -d $SUBJECTS_DIR/$i/surf`
    mri=`ls -d $SUBJECTS_DIR/$i/mri`
```

- Runs mri_cnr on the orig.mgz file to calculate CNR and outputs all measures into the subject-specific CNR file (/data/joy/BBL/studies/conte/processedData/freesurfer/[bblid]/[datexscanid]/[bblid]_[datexscanid]_cnr.txt)

```bash
mri_cnr $surf $mri/orig.mgz > $SUBJECTS_DIR/$i/stats/$bblid"_"$scanid"_cnr.txt"
```

- Creates variables for total CNR, gray/CSF CNR for the left and right hemispheres, and gray/white CNR for the left and right hemispheres

```bash
#create variables for total cnr, gray/csf for left and right hemispheres and gray/white for left and right hemispheres
#(total and cnr variables are grepping the information from the subject specific file; the variables then need to be cut in order to get just the number)
total=`grep "total CNR" $SUBJECTS_DIR/$i/stats/$bblid"_"$scanid"_cnr.txt"`
total2=`echo $total | cut -f 4 -d " "`
cnr=`grep "gray/white CNR" $SUBJECTS_DIR/$i/stats/$bblid"_"$scanid"_cnr.txt"`
graycsflh=`echo $cnr | cut -d "," -f 2 | cut -d "=" -f 2 | cut -d " " -f 2`
graycsfrh=`echo $cnr | cut -d "," -f 3 | cut -d "=" -f 2 | cut -d " " -f 2`
graywhitelh=`echo $cnr | cut -d "," -f 1 | cut -d "=" -f 2 | cut -d " " -f 2`
graywhiterh=`echo $cnr | cut -d "," -f 2 | cut -d "=" -f 3 | cut -d " " -f 2`
```

- Appends the CNR data to the aggregate table (/data/joy/BBL/studies/conte/subjectData/freesurfer/stats/cnr/cnr_buckner.csv)

```bash
echo $bblid,$scanid,$total2,$graycsflh,$graycsfrh,$graywhitelh,$graywhiterh >> $file
```

- Runs mris_euler_number on the lh.orig.nofix and rh.orig.nofix files to calculate Euler numbers for the left and right hemispheres and outputs the results into subject-specific Euler files (/data/joy/BBL/studies/conte/processedData/freesurfer/[bblid]/[datexscanid]/[bblid]_[datexscanid]_lh_euler.txt and /data/joy/BBL/studies/conte/processedData/freesurfer/[bblid]/[datexscanid]/[bblid]_[datexscanid]_rh_euler.txt)

```bash
script -c "mris_euler_number $surf/lh.orig.nofix" $SUBJECTS_DIR/$i/stats/$bblid"_"$scanid"_lh_euler.txt"
script -c "mris_euler_number $surf/rh.orig.nofix" $SUBJECTS_DIR/$i/stats/$bblid"_"$scanid"_rh_euler.txt"
```

- Creates variables for the left and right Euler numbers by grepping the lh/rh_euler.txt files

```bash
left=`grep ">" $SUBJECTS_DIR/$i/stats/$bblid"_"$scanid"_lh_euler.txt" | cut -d ">" -f 1 | cut -d "=" -f 4 | cut -d " " -f 2`
right=`grep ">" $SUBJECTS_DIR/$i/stats/$bblid"_"$scanid"_rh_euler.txt" | cut -d ">" -f 1 | cut -d "=" -f 4 | cut -d " " -f 2`
```

- Appends the Euler number data to the aggregate table (/data/joy/BBL/studies/conte/subjectData/freesurfer/stats/cnr/euler_number.csv)

```bash
echo $bblid,$scanid,$left,$right >> $euler_file
```
Notes:
- mris_euler_number must be run with `script -c` preceding it so that its output can be captured in a file
- The methods for this script were obtained from Chalavi et al., BMC Medical Imaging, 2012. doi:10.1186/1471-2342-12-27. http://www.biomedcentral.com/1471-2342/12/27
Script: /data/joy/BBL/projects/conteReproc2017/freesurfer/flag_outliers.R
Requires:
- Arguments passed by /data/joy/BBL/projects/conteReproc2017/freesurfer/QA.sh
- output directory (output_dir)
Steps:
- Create directory variables and the standard deviation threshold variable, and read in the arguments passed by the QA.sh script

```r
### ARGS ###
############
subjects.dir<-commandArgs(TRUE)[1]
sdthresh<-2

### DIRS ###
############
stats.dir<-file.path(subjects.dir)
aparc.dir<-file.path(stats.dir, 'aparc.stats')
aseg.dir<-file.path(stats.dir, 'aseg.stats')
area.dir<-file.path(stats.dir, 'aparc.stats/area')
curvature.dir<-file.path(stats.dir, 'aparc.stats/curvature')
```

- Create variables for the input files

```r
mean.file<-file.path(aparc.dir, 'bilateral.meanthickness.totalarea.csv')
cnr.file<-file.path(stats.dir, 'cnr/cnr_buckner.csv')
snr.file<-file.path(stats.dir, 'cnr/snr.txt')
aseg.volume.file<-file.path(aseg.dir, 'aseg.stats.volume.csv')
lh.thickness.file<-file.path(aparc.dir, 'lh.aparc.stats.thickness.csv')
rh.thickness.file<-file.path(aparc.dir, 'rh.aparc.stats.thickness.csv')
```
- Read in the mean, cnr, snr, and aseg data and merge them into a data frame called full

```r
mean.data<-read.csv(mean.file, strip.white=TRUE)
mean.data$meanthickness<-rowMeans(mean.data[, c('rh.meanthickness', 'lh.meanthickness')])
mean.data$totalarea<-rowSums(mean.data[, c('rh.totalarea', 'lh.totalarea')])
mean.data<-mean.data[,!(grepl('lh', names(mean.data)) | grepl('rh', names(mean.data)))]

cnr.data<-read.csv(cnr.file, strip.white=TRUE, header=FALSE)
full<-mean.data
full$cnr<- cnr.data$V3[match(full$scanid,cnr.data$V2)]
full$cnr<- as.numeric(as.character(full$cnr))

snr.data<-try(read.table(snr.file, strip.white=TRUE, header=FALSE, col.names=c('subject', 'snr')))
if(is.data.frame(snr.data)){
  snr.data[,c('bblid', 'scanid')]<-apply(do.call(rbind, strsplit(as.character(snr.data$subject), split="/")), 2, as.character)
  snr.data<-snr.data[,-1]
  full$snr<- snr.data$snr[match(full$scanid,snr.data$scanid)]
}

aseg.volume.data<-read.table(aseg.volume.file, strip.white=TRUE, header=TRUE)
aseg.volume.data[,c('bblid', 'scanid')]<-apply(do.call(rbind, strsplit(as.character(aseg.volume.data$Measure.volume), split="/")), 2, as.character)
aseg.volume.data<-aseg.volume.data[,c("bblid", "scanid", "SubCortGrayVol", "CortexVol", "CorticalWhiteMatterVol")]
full$SubCortGrayVol<- aseg.volume.data$SubCortGrayVol[match(full$scanid,aseg.volume.data$scanid)]
full$CortexVol<- aseg.volume.data$CortexVol[match(full$scanid,aseg.volume.data$scanid)]
full$CorticalWhiteMatterVol<- aseg.volume.data$CorticalWhiteMatterVol[match(full$scanid,aseg.volume.data$scanid)]
```

- Read in the thickness data and merge the left and right thickness data into one data frame

```r
thickness.data<-read.table(lh.thickness.file, header=TRUE, strip.white=TRUE)
rh.thickness.data<-read.table(rh.thickness.file, header=TRUE, strip.white=TRUE)
thickness.data[,c('bblid', 'scanid')]<-apply(do.call(rbind, strsplit(as.character(thickness.data$lh.aparc.thickness), split="/")), 2, as.character)
rh.thickness.data[,c('bblid', 'scanid')]<-apply(do.call(rbind, strsplit(as.character(rh.thickness.data$rh.aparc.thickness), split="/")), 2, as.character)
rh.thickness.data<-rh.thickness.data[,-1]
thickness.data<-thickness.data[,-1]
thickness.data<-merge(thickness.data, rh.thickness.data, all=TRUE, by=c("scanid","bblid"))
rm('rh.thickness.data')
```
- Create data to calculate the standard deviation cut-offs from and get the names of the left and right hemisphere ROIs

```r
lh.names<-grep('lh', names(thickness.data), value=TRUE)
rh.names<-sub('lh', 'rh', lh.names)
tmp_lh<-data.frame(matrix(NA, nrow=(nrow(thickness.data)), ncol=length(lh.names)+1))
tmp_lh[1]<- thickness.data$scanid
colnames(tmp_lh)[1]<-"scanid"
colnames(tmp_lh)[2:ncol(tmp_lh)]<- lh.names
tmp_rh<-data.frame(matrix(NA, nrow=(nrow(thickness.data)), ncol=length(rh.names)+1))
tmp_rh[1]<- thickness.data$scanid
colnames(tmp_rh)[1]<-"scanid"
colnames(tmp_rh)[2:ncol(tmp_rh)]<- rh.names
```

- Calculate the 2 SD cut-off for each left and right ROI thickness and calculate whether each subject in the full dataset is a SD outlier based on the threshold you set (sdthresh)

```r
for (i in lh.names){
  sd_thresh<-(sdthresh*(sd(subset.data[,i])))
  sd_above_value<- mean(subset.data[,i])+sd_thresh
  sd_below_value<- mean(subset.data[,i])-sd_thresh
  tmp_lh[i]<- "0"
  tmp_lh[i][thickness.data[i]>sd_above_value]<- "1"
  tmp_lh[i][thickness.data[i]<sd_below_value]<- "1"
}
for (i in rh.names){
  sd_thresh<-(sdthresh*(sd(subset.data[,i])))
  sd_above_value<- mean(subset.data[,i])+sd_thresh
  sd_below_value<- mean(subset.data[,i])-sd_thresh
  tmp_rh[i]<- "0"
  tmp_rh[i][thickness.data[i]>sd_above_value]<- "1"
  tmp_rh[i][thickness.data[i]<sd_below_value]<- "1"
}
tmp<- cbind(tmp_lh, tmp_rh[2:ncol(tmp_rh)])
tmp2<-data.frame(sapply(tmp[2:ncol(tmp)], function(x) as.numeric(as.character(x))))
tmp2<- cbind(tmp[1], tmp2)

#get number of thickness ROIs (sum of 1's just calculated for each subject)
#count number of outlying regions for each subject
thickness.data$noutliers.thickness.rois<-rowSums(tmp2[2:ncol(tmp2)])
```
- Calculate the 2 SD cut-off for each ROI laterality index, (lh - rh)/(lh + rh), and calculate whether each subject in the full dataset is a SD outlier based on the threshold you set (sdthresh)

```r
tmp_laterality<-data.frame(matrix(NA, nrow=(nrow(thickness.data)), ncol=length(lh.names)+1))
tmp_laterality[1]<- thickness.data$scanid
colnames(tmp_laterality)[1]<-"scanid"
colnames(tmp_laterality)[2:ncol(tmp_laterality)]<- lh.names
for (z in seq(1, length(lh.names))){
  i <- lh.names[z]
  r_name<- paste("rh", substring(i,4,10000), sep="_")
  sd_above_value<-(mean((subset.data[,i] - subset.data[,r_name])/(subset.data[,i] + subset.data[,r_name]))+(sdthresh*(sd((subset.data[,i] - subset.data[,r_name])/(subset.data[,i] + subset.data[,r_name])))))
  sd_below_value<-(mean((subset.data[,i] - subset.data[,r_name])/(subset.data[,i] + subset.data[,r_name]))-(sdthresh*(sd((subset.data[,i] - subset.data[,r_name])/(subset.data[,i] + subset.data[,r_name])))))
  tmp_laterality[,z+1]<- "0"
  tmp_laterality[,z+1][which((thickness.data[,i] - thickness.data[,r_name])/(thickness.data[,i] + thickness.data[,r_name])>sd_above_value)]<- "1"
  tmp_laterality[,z+1][which((thickness.data[,i] - thickness.data[,r_name])/(thickness.data[,i] + thickness.data[,r_name])<sd_below_value)]<- "1"
  tmp_laterality[,z+1]<- as.numeric(tmp_laterality[,z+1])
}
thickness.data$noutliers.lat.thickness.rois<-rowSums(tmp_laterality[2:ncol(tmp_laterality)])
```

- Flag the mean measures (mean thickness, total surface area, cortical volume, subcortical gray matter, cortical white matter, as well as CNR and SNR) based on > 2 SD

```r
thickness.data.mean<-full
mean_names<- c('meanthickness', 'totalarea', "SubCortGrayVol", "CortexVol", "CorticalWhiteMatterVol", "cnr", "snr")
tmp_mean<-data.frame(matrix(NA, nrow=(nrow(full)), ncol=length(mean_names)+1))
tmp_mean[1]<- full$scanid
colnames(tmp_mean)[1]<-"scanid"
colnames(tmp_mean)[2:ncol(tmp_mean)]<- mean_names

#then calculate the 2SD cut off for each mean and calculate if each subject in the full dataset is a SD outlier based on the threshold you set (sdthresh)
for (i in mean_names){
  sd_thresh<-(sdthresh*(sd(thickness.data.mean[,i])))
  sd_above_value<- mean(thickness.data.mean[,i])+sd_thresh
  sd_below_value<- mean(thickness.data.mean[,i])-sd_thresh
  tmp_mean[i]<- "0"
  tmp_mean[i][full[i]>sd_above_value]<- "1"
  tmp_mean[i][full[i]<sd_below_value]<- "1"
}
colnames(tmp_mean)[2:ncol(tmp_mean)]<- c(paste(mean_names, 'outlier', sep="_"))
```
- Merge the thickness outliers and the mean-measure outlier flags into the full data frame

```r
thickness.data<-thickness.data[,c('bblid', 'scanid', 'noutliers.thickness.rois', 'noutliers.lat.thickness.rois')]
full$noutliers.thickness.rois<- thickness.data$noutliers.thickness.rois[match(full$scanid,thickness.data$scanid)]
full$noutliers.lat.thickness.rois<- thickness.data$noutliers.lat.thickness.rois[match(full$scanid,thickness.data$scanid)]
full$meanthickness_outlier<- tmp_mean$meanthickness_outlier[match(full$scanid,tmp_mean$scanid)]
full$totalarea_outlier<- tmp_mean$totalarea_outlier[match(full$scanid,tmp_mean$scanid)]
full$SubCortGrayVol_outlier<- tmp_mean$SubCortGrayVol_outlier[match(full$scanid,tmp_mean$scanid)]
full$CortexVol_outlier<- tmp_mean$CortexVol_outlier[match(full$scanid,tmp_mean$scanid)]
full$CorticalWhiteMatterVol_outlier<- tmp_mean$CorticalWhiteMatterVol_outlier[match(full$scanid,tmp_mean$scanid)]
full$cnr_outlier<- tmp_mean$cnr_outlier[match(full$scanid,tmp_mean$scanid)]
full$snr_outlier<- tmp_mean$snr_outlier[match(full$scanid,tmp_mean$scanid)]
```

- Get the flag column names

```r
flags<-names(full)[which(!names(full) %in% c('bblid', 'scanid'))]
```

- Write the outlier file to /data/joy/BBL/studies/conte/subjectData/freesurfer/stats/all.flags.n[subjnum].csv

```r
noutliers.flags<-grep('noutlier', names(full), value=T)
full[,paste(noutliers.flags, 'outlier', sep="_")]<-as.numeric(scale(full[,noutliers.flags])>sdthresh)
write.csv(full, file.path(stats.dir, paste('all.flags.n', nrow(full), '.csv', sep='')), quote=FALSE, row.names=FALSE)
cat('wrote file to', file.path(stats.dir, paste('all.flags.n', nrow(full), '.csv', sep='')), '\n')
```
Notes:
- For the ROI-based measures, we first count the number of outlier ROIs for each subject, and then compute outliers across subjects on that count (the number of ROIs flagged); a minimal sketch of this two-stage approach follows these notes.
- The measures that are flagged are based on comments here: http://saturn/wiki/index.php/QA
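A minimal sketch of the two-stage ROI logic, assuming a numeric matrix `roi_thickness` with one row per subject and one column per ROI; the variable names here are illustrative, not the script's own (the real code is in flag_outliers.R above).

```r
# Illustrative sketch of the two-stage ROI outlier logic (names are not the script's own).
# Assumes roi_thickness is a numeric matrix: one row per subject, one column per ROI.
sdthresh <- 2

# Stage 1: z-score each ROI across subjects and count, per subject, how many ROIs fall
# more than 2 SD from that ROI's sample mean
roi_z <- scale(roi_thickness)
n_outlier_rois <- rowSums(abs(roi_z) > sdthresh, na.rm = TRUE)

# Stage 2: flag subjects whose outlier-ROI count is itself more than 2 SD above the mean
# count, mirroring the scale(...) > sdthresh step in flag_outliers.R
count_flag <- as.integer(scale(n_outlier_rois) > sdthresh)
```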
Flag (gray/csf flag, gray/white flag, euler number flag, number outliers rois thickness flag, total outliers)
Script: /data/joy/BBL/projects/conteReproc2017/freesurfer/cnr_euler_qa.R
Requires:
- Arguments passed by /data/joy/BBL/projects/conteReproc2017/freesurfer/QA.sh
- full path to directory where csv's should be output (output_dir)
Steps:
- Sets variables based on those passed by the wrapper script and reads in the cnr and euler number csv's

```r
output.dir<-commandArgs(TRUE)[1]
cnr_data<- read.csv(paste(output.dir,"/cnr/cnr_buckner.csv",sep=""))
euler_data<- read.csv(paste(output.dir,"/cnr/euler_number.csv",sep=""))
```

- Merge the files together into one data frame and exclude the subjects that failed freesurfer processing

```r
#merge the files together by datexscanid
data<- cnr_data
data$left_euler<- euler_data$left_euler[match(data$scanid,euler_data$scanid)]
data$right_euler<- euler_data$right_euler[match(data$scanid,euler_data$scanid)]
```
- Create a data frame with mean values for gray/csf cnr, gray/white cnr, and euler numbers across hemispheres

```r
#create a dataframe which will get the flags based on euler and cnr calculations
flags<- data

#get mean values for gray/csf cnr, gray/white cnr and euler numbers (average across hemispheres)
flags$mean_euler<-(flags$left_euler+flags$right_euler)/2
flags$mean_graycsf_cnr<- (flags$graycsflh+flags$graycsfrh)/2
flags$mean_graywhite_cnr<- (flags$graywhitelh+flags$graywhiterh)/2

#subset data frame to only IDs and averages
flags<- flags[,c(1,2,10:12)]
```

- Create variables to be used as cutoff values for cnr and euler number flagging qa (mean - 2 SD)

```r
graycsf_cutoff<- mean(flags$mean_graycsf_cnr-(2*sd(flags$mean_graycsf_cnr,na.rm=T)),na.rm=T)
graywhite_cutoff<- mean(flags$mean_graywhite_cnr-(2*sd(flags$mean_graywhite_cnr,na.rm=T)),na.rm=T)
euler_cutoff<- mean(flags$mean_euler-(2*sd(flags$mean_euler,na.rm=T)),na.rm=T)
```
- Flag subjects for each measure based on the mean - 2 SD cutoff (binary flagged column: 1=yes, 0=no)

```r
#create a binary flag column (1=yes, 0=no) for average cnr and euler numbers (<2 SD =1, >2 SD=0)
flags$graycsf_flag<- NA
flags$graywhite_flag<- NA
flags$euler_flag<- NA

#remove any subjects with no data
flags<- flags[! is.na(flags$mean_euler),]

for (i in 1:nrow(flags)){
  if (flags$mean_graycsf_cnr[i]<= graycsf_cutoff){
    flags$graycsf_flag[i]<- 1
  } else if (flags$mean_graycsf_cnr[i]>graycsf_cutoff){
    flags$graycsf_flag[i]<- 0
  }
  if (flags$mean_graywhite_cnr[i]<=graywhite_cutoff){
    flags$graywhite_flag[i]<- 1
  } else if (flags$mean_graywhite_cnr[i]>graywhite_cutoff){
    flags$graywhite_flag[i]<- 0
  }
  if (flags$mean_euler[i]<=euler_cutoff){
    flags$euler_flag[i]<- 1
  } else if (flags$mean_euler[i]>euler_cutoff){
    flags$euler_flag[i]<- 0
  }
}
```
- Create a total outliers column in the data (which sums the number of flags across measures for each subject) and a binary flagged column, which gets a 1 for a subject if there is at least 1 flagged measure for that subject and a 0 if there were no outliers in any measure for that subject

```r
#subset data frame to only IDs and flags
flags<- flags[,c(1,2,6:8)]

#create a total outliers column which gets the number of total outliers and a column which gets a binary flag 1=yes, 0=no
flags$total_outliers<- NA
x<- cbind(flags$graycsf_flag,flags$graywhite_flag,flags$euler_flag)
flags$total_outliers<- apply(x,1,sum)
flags$flagged<- flags$total_outliers
flags$flagged[flags$flagged>0]<- 1
nsubj<- nrow(flags)
```
- Write out the flagged csv

```r
write.csv(flags,paste("/data/joy/BBL/studies/conte/subjectData/freesurfer/stats/cnr_euler_flags_n",nsubj,".csv",sep=""))
```
Script: /data/joy/BBL/projects/conteReproc2017/freesurfer/QA.sh
Steps:
- Loop through subjects (slist) and run the QA_TOOLS recon_checker without checking aseg data, output file order, or logfile status for outliers or errors, and without taking snapshots, and write its output to a temporary file

```bash
for i in $(cat $slist); do
    $QA_TOOLS/recon_checker -s $i -nocheck-aseg -nocheck-status -nocheck-outputFOF -no-snaps
done > temp.txt
```

- Grep the "wm-anat-snr results" lines for the subject id (bblid/datexscanid) and output it to temp2.txt

```bash
grep "wm-anat-snr results" temp.txt | cut -d"(" -f2 | cut -d")" -f1 > temp2.txt
```

- Loop through each "wm-anat-snr results" line in temp.txt, pull the SNR value (two lines below it) from temp.txt, and output it to temp3.txt

```bash
for i in $(cat -n temp.txt | grep "wm-anat-snr results" | cut -f1); do
    echo $(sed -n "$(echo $i +2 | bc)p" temp.txt | cut -f1)
done > temp3.txt
```

- Paste each subject's id and SNR value together and write them to the snr.txt file (/data/joy/BBL/studies/conte/subjectData/freesurfer/stats/cnr/snr.txt)

```bash
paste temp2.txt temp3.txt > $output_dir/cnr/snr.txt
```

- Remove the temporary files

```bash
rm -f temp*.txt
```
Notes:
- More information on the recon_checker command can be found on the FreeSurfer QA TOOLS wiki https://surfer.nmr.mgh.harvard.edu/fswiki/QATools
Script: /data/joy/BBL/projects/conteReproc2017/freesurfer/QA.sh
Steps:
- Create variables for the automatic qa output of flags and the cnr/euler number output

```bash
auto_flag=$output_dir/all.flags.n103.csv
euler_flag=$output_dir/cnr_euler_flags_n102.csv
```
Script: /data/joy/BBL/projects/conteReproc2017/freesurfer/sum_flags_auto_qa.R
Requires:
- Arguments passed by /data/joy/BBL/projects/conteReproc2017/freesurfer/QA.sh
- full path to directory where csv's should be output (output_dir)
- csv of automatic flags from QA (auto_flag)
- csv of automatic cnr/euler number flags from QA (euler_flag)
Steps:
- Sets variables based on those passed by the wrapper script and reads in the automatic qa csv's

```r
output.dir<-commandArgs(TRUE)[1]
auto_flag<-read.csv(commandArgs(TRUE)[2])
euler_flag<-read.csv(commandArgs(TRUE)[3])
```

- Creates a data sheet containing all of the columns used for QA flagging, and merges in the euler and cnr flag columns

```r
flags<- auto_flag[,c(1:2,12:16,18:20)]

#merge in the euler number columns (gray/white cnr, gray/csf cnr, euler number)
flags$graycsf_flag<- euler_flag$graycsf_flag[match(flags$scanid,euler_flag$scanid)]
flags$graywhite_flag<- euler_flag$graywhite_flag[match(flags$scanid,euler_flag$scanid)]
flags$euler_flag<- euler_flag$euler_flag[match(flags$scanid,euler_flag$scanid)]
```

- Creates a summary column that sums the QA flags in the csv to get the total number of outliers

```r
flags$total_outliers<- rowSums(flags[,3:13])
```

- Creates a column with a binary 1 (flagged) or 0 (not flagged) indicating whether the subject has one or more total outliers as defined above

```r
flags$fsFlag<- "NA"
flags$fsFlag[which(flags$total_outliers==0)]<- "0"
flags$fsFlag[which(flags$total_outliers>0)]<- "1"
```

- Outputs the summary qa file

```r
write.csv(flags, file.path(output.dir, "auto.qa.summary.flags.csv"), quote=FALSE, row.names=FALSE)
cat('wrote file to', file.path(output.dir, "auto.qa.summary.flags.csv"), '\n')
```