Aylin's Monster (OLD) - PennLINC/Reward GitHub Wiki

Documentation from 05/03/2016

How to Monster

Full in-depth instructions are on GitHub under BBL – reward analysis scripts, in the page titled “pulling subject data from Selkie”.

Before you can run the script:

  • You need access to Selkie REDCap and the project “Wolf Satterthwaite Repository”.
  • You need to request an API token for the project and create ~/.redcap.cfg in your home folder, containing this token, which allows API access.

How to run the Date-Matching Script:

The Call:

/import/monstrum/Applications/R3.1.2/bin/R --file=/import/monstrum/Users/adaldal/collapse_redcap_data_across_forms_v3.4.R --slave --args "Wolf Satterthwaite Repository" "1500" "/import/monstrum/Users/adaldal/effort_350.csv" "/import/monstrum/Users/adaldal/dwitem_7-29-15.csv" .8 "studyenroll" "diagnosis" "scanid" "demographics" "grit" "gritsummary" "bdiold" "bdi" "bdisummary" "cdss" "cdsssummary"
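Because the call above is one long line of positional arguments, it can help to give each position a name before running it. The following is a minimal sketch, not part of the original scripts: the variable names and the `print_monster_call` function are hypothetical, while the paths and values are copied from the call above. It only prints the assembled command so it can be reviewed and pasted; it does not run R.

```shell
#!/bin/sh
# Hypothetical helper: names each positional argument of the call above.
# Variable and function names are illustrative; values are copied from the call.
R_BIN="/import/monstrum/Applications/R3.1.2/bin/R"
SCRIPT="/import/monstrum/Users/adaldal/collapse_redcap_data_across_forms_v3.4.R"
PROJECT="Wolf Satterthwaite Repository"   # argument 1: REDCap project name
DATE_RANGE_DAYS="1500"                    # argument 2: date range in days
INPUT_CSV="/import/monstrum/Users/adaldal/effort_350.csv"       # argument 3
OUTPUT_CSV="/import/monstrum/Users/adaldal/dwitem_7-29-15.csv"  # argument 4
NA_PROPORTION=".8"                        # argument 5: proportion of NAs setting

# Print the full command; every argument to this function becomes one
# quoted measure name at the end of the call.
print_monster_call() {
    printf '%s --file=%s --slave --args "%s" "%s" "%s" "%s" %s' \
        "$R_BIN" "$SCRIPT" "$PROJECT" "$DATE_RANGE_DAYS" \
        "$INPUT_CSV" "$OUTPUT_CSV" "$NA_PROPORTION"
    for measure in "$@"; do
        printf ' "%s"' "$measure"
    done
    printf '\n'
}

print_monster_call studyenroll diagnosis scanid demographics
```

Keeping the measures as the function's own arguments means the list can be changed per run without touching the fixed part of the call.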

The Arguments:

The Script:

/import/monstrum/Applications/R3.1.2/bin/R --file=/import/monstrum/Users/adaldal/collapse_redcap_data_across_forms_v3.2.R --slave --args "Wolf Satterthwaite Repository"

Input file: the path to the input file, which must be a CSV containing bblids and dates in yyyymmdd format. Example: /import/monstrum/Users/adaldal/effort_350.csv
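A quick check of the input file before a run can catch dates that are not in yyyymmdd form. This is a sketch under stated assumptions, not something the original script provides: the `check_input_dates` function is hypothetical, and it assumes the CSV has a header row with the bblid in column 1 and the date in column 2. Adjust the column numbers to match your file.

```shell
# Hypothetical pre-flight check for the input CSV.
# ASSUMPTION: header row present, bblid in column 1, date in column 2.
# It checks only the shape of the date (exactly eight digits), not whether
# the value is a real calendar date.
check_input_dates() {
    awk -F',' '
        BEGIN { bad = 0 }
        NR > 1 && $2 !~ /^[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]$/ {
            printf "line %d: bad date \"%s\" for bblid %s\n", NR, $2, $1
            bad = 1
        }
        END { exit bad }
    ' "$1"
}

# Example use:
#   check_input_dates /import/monstrum/Users/adaldal/effort_350.csv
```

The function exits non-zero if any row fails, so it can gate the actual call in a wrapper script.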

Output file: the path to the output file, which must be a CSV. Example: /import/monstrum/Users/adaldal/dwitem_7-29-15.csv

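Date Range: "1500", the maximum number of days allowed between the two dates (the date you provide and the date of the measure it is matching on). In the call, this argument comes right after the project name, before the input file.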
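To get a feel for what a given date range admits, the gap between two yyyymmdd dates can be computed by hand. The sketch below is illustrative only and is not taken from the R script: the function names are hypothetical, it assumes a `date` command that accepts `-d` with a YYYY-MM-DD string (as GNU coreutils does on Linux), and it assumes the window is symmetric and inclusive, which this page does not state.

```shell
# Convert yyyymmdd to seconds since the epoch (UTC).
# ASSUMPTION: `date -d` is available (GNU coreutils).
to_epoch() {
    iso=$(printf '%s' "$1" | sed 's/^\(....\)\(..\)\(..\)$/\1-\2-\3/')
    date -u -d "$iso" +%s
}

# Print the absolute number of days between two yyyymmdd dates.
days_between() {
    a=$(to_epoch "$1")
    b=$(to_epoch "$2")
    diff=$(( (b - a) / 86400 ))
    if [ "$diff" -lt 0 ]; then
        diff=$(( -diff ))
    fi
    echo "$diff"
}

# Succeed when the two dates are no more than $3 days apart.
within_range() {
    [ "$(days_between "$1" "$2")" -le "$3" ]
}

# Example use:
#   within_range 20150729 20160729 1500 && echo "would match"
```

With a range of 1500 days, a measure collected roughly four years before or after the provided date would still fall inside the window under these assumptions.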

Proportion of NAs: .8 (i.e., it accepts data with 20% or fewer NAs). Be careful with this value, because it will occasionally exclude subject data and output NAs instead. For example, subjects with floor prt (missing reaction times) are excluded at a value of .8 because about 25% of their data is entered as “missing” NAs.
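The arithmetic behind this setting can be sketched as follows. This is an illustration of the rule as described above, not the R script's actual code: the `passes_na_threshold` function is hypothetical, and how the script groups values when counting NAs (per form, per subject) is not described on this page.

```shell
# Succeed when the share of non-NA values is at least the threshold.
# Usage: passes_na_threshold THRESHOLD value1 value2 ...
passes_na_threshold() {
    threshold=$1
    shift
    total=$#
    # No values at all: treat as failing rather than dividing by zero.
    [ "$total" -gt 0 ] || return 1
    missing=0
    for value in "$@"; do
        if [ "$value" = "NA" ]; then
            missing=$((missing + 1))
        fi
    done
    # awk does the floating-point comparison that plain sh cannot.
    awk -v t="$threshold" -v n="$total" -v m="$missing" \
        'BEGIN { exit (((n - m) / n >= t) ? 0 : 1) }'
}

# Example use: one NA out of four values is 75% complete, which fails at .8
#   passes_na_threshold .8 310 295 NA 402 || echo "excluded"
```

This mirrors the floor prt case above: with about 25% of values missing, a subject is only about 75% complete and falls below .8, while a lower setting such as .7 would keep them.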

Measures: the measures that you are trying to pull from the project, e.g. "studyenroll" "diagnosis" "scanid" "demographics" "grit" "gritsummary" "bdiold" "bdi" "bdisummary" "cdss" "cdsssummary". Item-level and summary scores for all studies (Day2, FNDM, FNDM2, Effort and Nodra) live in this project, so make sure to spell each name correctly; otherwise the script will not be able to pull it. Make sure you reference each measure by the procedure name defined in the project (e.g., CAINS summary = “newfranksummary” in the big project).

  • studyenroll, scanids, diagnosis, occupation and medication are their own procedures; they are not part of the demographics procedure.

Common Errors:

Ex.

[1] "Procedure 'MoCoSeries' does not exist in data dictionary."
[1] "Procedure '' does not exist in data dictionary."
Error: Procedures and forms unmatched. Please check your Redcap data.
Execution halted

The problem: you tried to reference a procedure that does not exist in the project.

The solution:

  1) Check the spelling of every procedure you are trying to pull, and double-check that it matches the procedure name in the large project.
  2) Make sure the procedure exists in the project by checking the data dictionary or the list of procedures on the left of the project home page.
  3) Make sure that any data you uploaded was linked to a procedure by creating participant_id, bblid, procedure, and dov columns for each spreadsheet.
  4) You CANNOT run the script without including “demographics” as a procedure, because dob, dovisit, etc. are calculated using values that only exist in demographics. Every other measure is optional and can be added or removed.
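Most of these failures can be caught before the script runs by comparing the requested measures against a list of known procedure names. The sketch below is a hypothetical helper, not part of the original tooling: `check_measures` and the one-name-per-line file it reads are assumptions, and the list itself would have to be copied by hand from the project's data dictionary.

```shell
# Usage: check_measures KNOWN_FILE measure1 measure2 ...
# KNOWN_FILE is a hypothetical text file with one procedure name per line.
# Fails if any measure is empty, is not listed, or if "demographics" is absent.
check_measures() {
    known_file=$1
    shift
    status=0
    has_demographics=0
    for measure in "$@"; do
        if [ "$measure" = "demographics" ]; then
            has_demographics=1
        fi
        if [ -z "$measure" ]; then
            echo "empty procedure name in measure list" >&2
            status=1
        elif ! grep -qxF -- "$measure" "$known_file"; then
            echo "unknown procedure: '$measure'" >&2
            status=1
        fi
    done
    if [ "$has_demographics" -eq 0 ]; then
        echo "missing required procedure: demographics" >&2
        status=1
    fi
    return "$status"
}

# Example use:
#   check_measures known_procedures.txt demographics grit gritsummary
```

The match is exact and case-sensitive, which is intentional: a name such as MoCoSeries only passes if it appears in the list with the same spelling and capitalization.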