Using GitHub for research projects
Pros
- Track changes to all files: software, notes, etc. (really just a benefit of Git)
- Issue tracker provides a history of the thought processes and conversations behind solving problems
- Easy to create a website for each project by pushing an index.html file to the gh-pages branch
- Can use Git Large File Storage to version large files without taking up a huge amount of disk space
- README.md provides a nice "home base" for each project, stating its purpose, how it works, etc.
- Issue tracker can be made into a board using https://waffle.io
- Pull requests allow others to suggest improvements without needing write access
- Can experiment with software changes (using branches) without losing any work
- Issue tracker can be used to automatically document how issues were solved by linking to commits, e.g., by putting issue numbers in commit messages:
git commit -am "Fix bug in calculating C_P; resolves #52"
Cons
- Can't diff/merge binary files, e.g., SolidWorks models, Word documents, Excel spreadsheets
Methods
Dealing with CAD files
These can be kept on a cloud drive, e.g., Dropbox, and linked from the README. Once Onshape has a few more features, it will be a viable alternative to local CAD software.
Raw data and simulation results
These usually shouldn't be committed to the repo unless they are very small. Instead, they can be zipped up and put on a cloud drive or Figshare, and the repo can contain a script or function for downloading the raw data.
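As a rough illustration of such a download function, here is a minimal sketch using only the Python standard library. The function name `download_raw_data`, the `data/raw` directory, the archive name, and the example URL in the comment are all hypothetical, not taken from any of my repos; it also assumes the data is published as a single zip archive.

```python
import urllib.request
import zipfile
from pathlib import Path


def download_raw_data(url, dest_dir="data/raw", archive_name="raw-data.zip"):
    """Download a zip archive from `url` and extract it into `dest_dir`.

    The download is skipped if the archive is already present locally.
    Returns the sorted list of member names in the archive.
    """
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    archive = dest / archive_name
    if not archive.exists():
        # urlretrieve handles http(s):// and file:// URLs alike
        urllib.request.urlretrieve(url, str(archive))
    with zipfile.ZipFile(archive) as zf:
        zf.extractall(dest)
        return sorted(zf.namelist())


# Hypothetical usage (the URL is a placeholder, not a real dataset):
# download_raw_data("https://example.com/my-experiment/raw-data.zip")
```

With something like this in the repo, `data/raw` can be listed in `.gitignore` so the extracted files never get committed by accident.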
Experiment repos versus paper repos
Currently, I create a repo for each experiment, and another for each paper about that experiment. Others may prefer to keep everything in one repo. I haven't done this yet since there have sometimes been multiple papers about one experiment, and including them all in the experiment repo may detract from its purpose of disseminating the data and software.
One technique I have played with a little is adding experiment repos as submodules to paper repos. This is especially helpful if working with multiple experiments. Submodules also track which version of the experiment repo was used for each paper automatically, which is convenient. See https://github.com/petebachant/CFT-Re-dep-paper for an example of submodule usage in a paper repo.
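To see which commit of each experiment repo a paper repo is pinned to, `git submodule status` prints one line per submodule: a one-character state flag, the commit SHA, the path, and sometimes `git describe` output in parentheses. Below is a small sketch that turns that output into records. The helper names `parse_submodule_status` and `submodule_versions` are made up for illustration, and the second one assumes `git` is on the PATH.

```python
import re
import subprocess

# One line of `git submodule status`: state flag, SHA, path, and
# optionally the `git describe` output in parentheses.
_STATUS_LINE = re.compile(r"^([ +\-U])([0-9a-f]+) (.+?)(?: \((.*)\))?$")

_STATES = {
    " ": "in sync",
    "-": "not initialized",
    "+": "checked-out commit differs from the recorded one",
    "U": "merge conflict",
}


def parse_submodule_status(text):
    """Parse `git submodule status` output into a list of dicts."""
    records = []
    for line in text.splitlines():
        match = _STATUS_LINE.match(line)
        if match:
            flag, sha, path, describe = match.groups()
            records.append(
                {
                    "path": path,
                    "commit": sha,
                    "state": _STATES[flag],
                    "describe": describe,
                }
            )
    return records


def submodule_versions(repo_dir="."):
    """Run `git submodule status` in `repo_dir` and parse the result."""
    result = subprocess.run(
        ["git", "submodule", "status"],
        cwd=repo_dir,
        check=True,
        capture_output=True,
        text=True,
    )
    return parse_submodule_status(result.stdout)
```

Calling `submodule_versions()` from the root of a paper repo would then give the experiment commits to quote in the paper or its README.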
Figures
If figure files are small, there isn't much harm in committing them, though it's probably not the best practice. A better alternative would be to include the projects used to create the figures as submodules, so the figures can be regenerated automatically.
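One way this regeneration could look, sketched under some loud assumptions: each submodule has a plotting script at its top level (called `plot.py` here) that writes its output into a `figures` directory. Both names, and the function `regenerate_figures` itself, are hypothetical conventions for this example rather than anything my repos necessarily follow.

```python
import subprocess
import sys
from pathlib import Path


def regenerate_figures(project_dirs, script="plot.py", figure_dir="figures"):
    """Run each project's plotting script and report the figures it produced.

    Returns a dict mapping each project directory name to the sorted
    file names found in its figure directory after the script has run.
    """
    produced = {}
    for project in map(Path, project_dirs):
        # Run the script with the same interpreter, from inside the project
        subprocess.run([sys.executable, script], cwd=project, check=True)
        produced[project.name] = sorted(
            p.name for p in (project / figure_dir).glob("*") if p.is_file()
        )
    return produced
```

A paper repo could call this with the paths of its submodules before building the document, so the figures always match the pinned experiment commits.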