How to get started
This page is for people who are beginning to learn data science and machine learning and are trying to decide which resources to start with.
- datacamp.com is an online learning resource with many gentle introductions to machine learning topics.
- kdnuggets.com is a blog with many thoughtful posts and recommendations in data science. For example, here is a description of "cold start" approaches (i.e., the data are not labeled, so you want to learn the properties of your data from the structure the machine finds rather than from user-defined labels): https://www.kdnuggets.com/2019/01/data-scientist-dilemma-cold-start-machine-learning.html
- Training and learning on a standardized data set is essential for the beginner! The MNIST handwritten digit dataset is very useful for building skills and intuition. Try the exercises in this reference: https://towardsdatascience.com/image-classification-in-10-minutes-with-mnist-dataset-54c35b77a38d
- If you already know Python, the last chapter of Jake VanderPlas's Python Data Science Handbook (written in Python/Jupyter) is a scikit-learn tutorial with many machine learning examples (SVM, K-means, naive Bayes). It is a great way to see how the methods work, understand where they are relevant, and break out of the "black box" right away, because it explains what each method is doing. The rest of the book has a lot of introductory material intended to turn you into an expert quickly.
- Lots of links and resources (online books, tutorials) at https://sites.google.com/view/helioanalytics/resources
- Statistics & Visualization in Python for Climate / Space. Contains upper-level course labs from the University of Michigan for learning the basics of Python data analysis and visualization. The course assumes students start with little to no programming experience.
- CS 229 at Stanford: course materials for Computer Science 229 (Machine Learning) at Stanford, a solid introduction to the core concepts behind common algorithms. This online course is derived from CS229 (same principles, with quizzes and graded homework exercises, but less rigorous): https://www.coursera.org/learn/machine-learning
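
Several of the resources above cover K-means, one of the methods that applies when data are unlabeled (the "cold start" case). To give a feel for what those tutorials walk through, here is a minimal sketch of K-means in plain Python. The toy points, the function name `kmeans`, and the choice to seed the centroids with the first `k` points are all assumptions made for illustration; none of this is taken from the linked resources, and in practice you would likely use a library such as scikit-learn instead.

```python
import math


def kmeans(points, k, iterations=20):
    """Minimal K-means sketch: returns (centroids, labels).

    Assumption: the first k points serve as the initial centroids, which
    keeps the example deterministic (real libraries use smarter seeding).
    """
    centroids = [tuple(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centroids.append(
                    tuple(sum(coord) / len(members) for coord in zip(*members))
                )
            else:
                # An empty cluster keeps its previous centroid.
                new_centroids.append(centroids[c])
        if new_centroids == centroids:
            break  # Converged: centroids stopped moving.
        centroids = new_centroids
    return centroids, labels


# Hypothetical toy data: two loose groups of unlabeled 2-D points.
data = [(1.0, 1.0), (8.0, 8.0), (1.5, 2.0), (9.0, 8.5), (1.0, 0.5), (8.5, 9.5)]
centroids, labels = kmeans(data, k=2)
print(labels)  # prints [0, 1, 0, 1, 0, 1]
```

No labels are supplied here: the grouping comes entirely from distances between points, which is the sense in which the machine, rather than the user, determines the structure.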
Future additions
- Separate based on type of resource (online course, book, tutorial, blog/discussion)
- Different instructions based on the goals of the learner? For example, some may want more fundamental principles and theory, some may want to know how to design code, while others may plunge in with demos and tutorials.