Week 1 - NicoDeshler/Roots GitHub Wiki

(Aug. 23 and 25, 2022)

The meetings this week had two driving questions:

  1. What is Open Science?
  2. What is Reproducible Science?

Open science is a research paradigm that places high value on clarity of presentation, accessibility, and reproducibility. Today, almost all physical sciences rely on computation for data processing, measurement collection, optimization, and so on. In this context, open science means several things:

  • Proper code documentation, so that readers can understand what the scripts are doing (allowing them to potentially optimize the work or make new discoveries).
  • Data organization and availability: properly setting up folder/file structures for datasets and thoroughly documenting all files with metadata.
  • Effective written communication. When code is involved, it is often preferable to use markup languages (e.g. Markdown) that integrate the conceptual components of the research with the computational components used to arrive at the results presented.
  • Version-controlled software and publications.
  • Establishing lab conventions.
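As a small illustration of the data documentation point above, here is a minimal sketch of writing a JSON metadata "sidecar" file next to a dataset. The function name `write_metadata`, the `data/raw` folder layout, the file name `run_001.csv`, and the choice of metadata fields are all assumptions made for this example, not an established convention.

```python
import hashlib
import json
import tempfile
from datetime import date
from pathlib import Path


def write_metadata(data_file: Path, description: str, units: dict) -> Path:
    """Write a JSON sidecar file describing `data_file` in the same folder."""
    metadata = {
        "file": data_file.name,
        "description": description,
        "units": units,  # maps column name -> unit
        "size_bytes": data_file.stat().st_size,
        # A checksum lets readers confirm they have the exact same data.
        "sha256": hashlib.sha256(data_file.read_bytes()).hexdigest(),
        "recorded": date.today().isoformat(),
    }
    sidecar = data_file.with_name(data_file.name + ".meta.json")
    sidecar.write_text(json.dumps(metadata, indent=2))
    return sidecar


# Usage with a hypothetical dataset in a temporary folder.
with tempfile.TemporaryDirectory() as tmp:
    raw_dir = Path(tmp) / "data" / "raw"  # assumed folder convention
    raw_dir.mkdir(parents=True)

    data_file = raw_dir / "run_001.csv"
    data_file.write_text("time_s,voltage_V\n0.0,1.2\n0.1,1.3\n")

    sidecar = write_metadata(
        data_file,
        description="Example voltage trace",
        units={"time_s": "seconds", "voltage_V": "volts"},
    )
    sidecar_name = sidecar.name
    loaded = json.loads(sidecar.read_text())
    print(sidecar_name)
    print(loaded["units"])
```

Keeping the metadata in a plain-text file beside the data means it can be version controlled and read without any special software.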

The aim of open science is, in part, to improve the reproducibility of published work by championing transparency and accessibility. One epistemological question I had is whether it is scientific results or scientific methods that should be reproducible. Implicitly, open science maintains that the methods should be reproducible, which I think is justifiable. In certain disciplines, adhering to a prescribed method will generally produce consistent results. For example, in physics, results are generally reproducible so long as the experimental methods are consistent (barring the obvious caveat of quantum mechanics, whose measurement outcomes are notoriously probabilistic). In contrast, equivalent experiments in climate science may deliver different results depending on when they are conducted. In both cases, the only feature of science that we can guarantee to be reproducible is the method. Thus, the method merits careful documentation.

Here are a few websites I found that I think exemplify open science:

Software tools designed to integrate different aspects of the scientific process (e.g. data collection/storage, data processing, writing, etc.) have grown alongside the rise of large data sets and processing techniques. Some examples include Quarto and Jupyter Notebook. Some of the skills I hope to develop in order to uphold the principles of open science are:

  • Identifying the software tool that is most effective for a particular set of tasks.
  • Building proficiency in a few of these integration tools.
  • Improving data visualization methods and data presentation.

My starting point on the path towards practicing open science effectively: