Week 2 - NicoDeshler/Roots GitHub Wiki

(Aug. 29 and Sep. 1, 2022)

This week we addressed two items:

  1. What is data science?
  2. An introduction to GitHub as a version-control platform for logging our progress through the R4R program.

Our group identified several qualities of data science. Data science generally involves...

  • operating on large datasets
  • computational tools and methods
  • finding patterns in data through different analysis techniques
  • statistics, mathematics, machine learning
  • data visualization

One observation in the discussion that I found particularly interesting was that data science is an applied science rather than a basic science. What does this mean? The principal objective of data science as a discipline is not to make fundamental discoveries about the nature of our world per se. Instead, the discipline is concerned with developing new tools and techniques that enable more sophisticated data analysis. In turn, these tools may enable discoveries in any other discipline that must make sense of large datasets. An analogy is the design of a telescope. Engineering a telescope is not, on its own, an investigation into a theory of nature. However, the telescope serves as a tool for observing phenomena that do lend insight into theories of nature.

Action Items

This week I intend to present the concept of open science to my lab during our group meeting. My hope is to encourage the group to take on creating a GitHub account for the lab in which students can publish their completed projects. To make their project repositories public, students will have to meet the following standards:

  • all code is well-documented and version controlled
  • a thorough README.md exists that describes how to get the project code running
  • Wiki pages are built out that walk through the background, theory, and results pertaining to the research project
  • links to pre-prints/publications of the associated research paper are readily available
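
Part of this checklist could be checked automatically before a repository goes public. The following is only a minimal sketch in Python, not an agreed-upon lab tool: the function name `check_repo_ready` and the result keys are hypothetical, and it covers just the two items a script can easily verify (the project is under version control, and a non-empty README.md exists). Documentation quality, Wiki content, and paper links would still need a human review.

```python
from pathlib import Path


def check_repo_ready(repo_dir):
    """Return a dict mapping each automated checklist item to True/False.

    Hypothetical helper: only checks for a .git entry and a non-empty
    top-level README.md. It does not judge the quality of either.
    """
    root = Path(repo_dir)
    readme = root / "README.md"
    return {
        # The project is under version control (a .git entry exists).
        "version_controlled": (root / ".git").exists(),
        # A non-empty README.md is present at the top level.
        "has_readme": readme.is_file() and readme.stat().st_size > 0,
    }


if __name__ == "__main__":
    # Run from the top level of a project to print the checklist status.
    for item, ok in check_repo_ready(".").items():
        print(f"{item}: {'ok' if ok else 'missing'}")
```

Running the script from the top level of a project prints one line per item, which a student could use as a quick self-check before asking for the repository to be made public.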

The PowerPoint presentation can be downloaded from this repository here.