Replication kit guide - ganong-noel/lab_manual GitHub Wiki

README.md

  • Describe the purpose of the replication code, and what analyses are/are not included in the replication kit.
    • Specify any proprietary data used in the paper and give instructions on how to get access to it, for example by providing contact information for the owner of the data.
  • Give clear instructions on how to run the scripts.
    • Have one driver script that runs all the programs in order.
    • List the necessary dependencies and packages in a prelim script.
  • Give a detailed description of the input, scripts, and output.
    • Input: the content and source of the data.
    • Scripts and output: the purpose of the code and the output it produces. For example, 2_regression_analysis.R produces Tables 2 and 3. The line describing each script in the README should be the same as the line at the top of that script.
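
The driver-script item above can be sketched as follows. This is a minimal illustration rather than the lab's actual driver: the file name `run_all.py`, the script names in `SCRIPTS`, and the function `run_all` are all hypothetical, and the scripts are assumed to be Python files in one folder.

```python
# run_all.py: runs every replication script in order (hypothetical driver).
import subprocess
import sys
from pathlib import Path

# Hypothetical script names; the numeric prefix fixes the run order.
SCRIPTS = (
    "0_prelim.py",               # checks dependencies and packages
    "1_clean_data.py",           # builds the analysis sample
    "2_regression_analysis.py",  # produces Tables 2 and 3
)


def run_all(script_dir, scripts=SCRIPTS):
    """Run each script in order and stop at the first failure."""
    script_dir = Path(script_dir)
    for name in scripts:
        print(f"Running {name}")
        # check=True raises CalledProcessError if a script exits non-zero,
        # so later scripts never run on top of a failed step.
        subprocess.run([sys.executable, str(script_dir / name)], check=True)
```

A real driver would end with a call such as `run_all(Path(__file__).resolve().parent)`, so that the scripts folder is located relative to the driver file rather than through an absolute path.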

Reproducible

Results should reproduce from the simplest building blocks and build up to the final results.

  • Include public data and raw files as much as possible.
  • No “magic numbers” (un-referenced, hard-coded numbers).
  • No absolute file paths.
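
As a rough sketch of the last two items, the snippet below names its constants and builds every path relative to the repkit root. The folder names, constant names, and values are illustrative assumptions, not taken from any actual replication kit.

```python
from pathlib import Path

# Paths are built relative to the repkit root, never hard-coded as absolute.
# Assumption: scripts are launched from the repkit root folder.
REPKIT_ROOT = Path(".")
INPUT_DIR = REPKIT_ROOT / "input"    # hypothetical folder name
OUTPUT_DIR = REPKIT_ROOT / "output"  # hypothetical folder name

# Named, documented constants instead of magic numbers (illustrative values).
WEEKS_PER_YEAR = 52        # unit conversion
MIN_WEEKLY_INCOME = 100.0  # hypothetical sample restriction; cite its source here


def annual_income(weekly_income):
    """Convert weekly income to annual income."""
    return weekly_income * WEEKS_PER_YEAR


def in_sample(weekly_income):
    """Apply the sample restriction using the named threshold."""
    return weekly_income >= MIN_WEEKLY_INCOME
```

Each constant is defined once with a comment pointing to where the number comes from, so a reader can trace it without searching the code.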

Documented

  • The purpose of each script is conveyed by its file name.
  • Document the options available when running the repkit (e.g. options to skip slow parts of the code).
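
One way to expose and document such an option is a command-line flag whose help text doubles as the documentation. The sketch below uses Python's standard `argparse` module; the flag name `--skip-slow` and the step names are hypothetical.

```python
import argparse


def parse_args(argv=None):
    """Parse the repkit's command-line options."""
    parser = argparse.ArgumentParser(description="Run the replication kit.")
    parser.add_argument(
        "--skip-slow",
        action="store_true",
        help="Skip slow steps (hypothetical example: bootstrap standard errors).",
    )
    return parser.parse_args(argv)


def steps_to_run(skip_slow):
    """Return the step names to run, leaving out the slow step if requested."""
    steps = ["clean_data", "regression_analysis"]  # hypothetical step names
    if not skip_slow:
        steps.append("bootstrap_standard_errors")
    return steps
```

Running the driver with `--help` then prints the option and its description, and the same sentence can be copied into the README.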

Parsimonious

  • Repkit should not contain code that is not used.
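
A simple check for leftover files is to compare the scripts on disk with the list the driver runs. The helper below is a hypothetical sketch; it only catches whole files that are never run, not unused functions inside a script.

```python
from pathlib import Path


def unused_scripts(script_dir, scripts_run, suffixes=(".py", ".R")):
    """Return script files in script_dir that the driver never runs."""
    on_disk = {p.name for p in Path(script_dir).iterdir() if p.suffix in suffixes}
    return sorted(on_disk - set(scripts_run))
```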

Well-organized

  • There should be a clear file structure. One possible structure is separate sub-directories for data input, scripts, and output.
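
The layout described above could be checked with a few lines of Python. The directory names `input`, `scripts`, and `output` are assumptions for illustration; a repkit may use other names.

```python
from pathlib import Path

# Hypothetical sub-directory names for data input, scripts, and output.
EXPECTED_DIRS = ("input", "scripts", "output")


def missing_dirs(root):
    """Return the expected sub-directories that do not exist under root."""
    return [d for d in EXPECTED_DIRS if not (Path(root) / d).is_dir()]
```

An empty list means all three sub-directories are present under the repkit root.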

Styles

  • Code complies with the relevant R or Python style guide.