Links - rmcgranaghan/data_science_tools_and_resources GitHub Wiki

This page contains links to useful resources. Please note that these lists are in no way exhaustive; they are meant to provide a foundation from which to discover, in a digestible and manageable way, the massive amount of material available. Check back often: this is a fluid list that will evolve over time.

Happy exploring!

General

Books

Academic Journals

Learning Communities/Communities of Practice

  • The NASA Center for HelioAnalytics (or contact Ryan McGranaghan) - building a Community of Practice around an informatics/data science approach to Heliophysics science
  • The Earth Science Information Partners (ESIP) - one of the foremost leaders in promoting the collection, stewardship and use of Earth science (and related) data, information and knowledge that is responsive to societal needs
  • The AI Learning Salon - a weekly forum to explore bridges and contentions in biological and artificial learning
  • The Royal Institution - YouTube series covering a wide range of science topics, including data and intelligence

Tutorials

Online courses

Elements to look for in online courses

  • Focuses on practical skills. Perhaps the most widely applicable are Python and R programming, Jupyter Notebooks, scikit-learn, TensorFlow and Keras, pandas, and xarray
  • Provides quick, quality, and consistent feedback
  • Is free or inexpensive (paying for a course that is worthwhile is a good way to get yourself to commit to it!)
  • Is project-oriented
  • Contains an excellent social interaction component

Other tutorial and learning resources

Data Visualization resources

Compilations of resources

Blogs

Podcasts

Ways to become active (i.e., the best way to learn)

  • Start working on open source projects (see links below)
  • Compete in a Kaggle competition
  • Join or start a Meetup and attend or host a Hackathon
  • Collaborate with a data scientist (e.g., find one at your university or work)
  • Reach out to a potential mentor
  • Take an online course
  • Explore your passions in a data-driven manner
  • 'Lurk' - join community email lists or forums to gain exposure to the language before contributing more actively

Open source projects and links

  • Apache Software Foundation: Mission is to provide software for the public good
  • Papers with Code: Mission is to create a free and open resource with Machine Learning papers, code and evaluation tables
  • Python Scikit-Learn: Mission is to provide a free and open machine learning library in the Python programming language
  • Open Source Data Science Masters (OSDSM): Mission is to provide an open-source curriculum for learning Data Science. Foundational in both theory and technologies, the OSDSM breaks down the core competencies necessary for making use of data.

Specific topics

Explainable Artificial Intelligence (XAI)

Frameworks for trustworthy, accountable, explainable systems (AI and other)

AI and ML applied to science

Scientific/research workflows

Open Science

The social component of data science - becoming transdisciplinary

Tools to improve virtual collaboration

Compilations of resources:

Post-it Note-like boards and resources:

Polling resources:

  • XLeap - excellent collective decision-making resource, but expensive
  • Menti
  • Sli.Do

Virtual Conference Tools:

General Interaction Tools:

  • GitHub - underrated as a full-stack collaboration tool (even for writing papers and proposals, not just for software)
  • Slack
  • Discord - place for teams to 'hang out' and work
  • QiqoChat - recommended by the Earth Science Information Partners group; a wrapper around Zoom (other platforms, including Jitsi, are also possible). Takes virtual meetings to the next level and encourages engagement in a variety of ways, not just webinar-style watching
  • Whereby - like Zoom, but in many ways simpler and easier (no downloads or installs, same link each time); free for individual use