6.419x - roadfoodr/mitx-sds-resources GitHub Wiki
6.419x Data Analysis: Statistical Modeling and Computation in Applications
General:
- Navigating Matplotlib by Brandon Rohrer
- 6.419x report template generator by M Powers
Module 1. Review: Statistics, Correlation, Regression, Gradient Descent
Observational Studies and Experiments
- Understanding Statistical Power and Significance Testing: an interactive visualization by Kristoffer Magnusson
- Power of a hypothesis test by melbapplets
- Visualizing Type I vs Type II tradeoffs
Hypothesis Testing
- Interpreting Confidence Intervals: an interactive visualization by Kristoffer Magnusson
Likelihood Ratio Test and Multiple Hypothesis Testing
- False Discovery Rates, FDR, clearly explained by StatQuest with Josh Starmer (Benjamini-Hochberg method)
- The False Discovery Rate: An Overview by Phil Anderson
Correlation and Least Squares Regression
Gradient Descent
- An Interactive Tutorial on Numerical Optimization by Ben Frederickson (scroll down to Gradient Descent section)
Recitation 1: Average Treatment Effect versus Average Treatment Effect for the Treated
Module 2: Genomics and High-Dimensional Data
Visualization of High-Dimensional Data
- Random Vectors
- Everything you did and didn't know about PCA by Alex Williams
- MDS and PCoA by StatQuest with Josh Starmer
- Machine Learning: Multidimensional Scaling by GeostatsGuy Lectures
- How to Use t-SNE Effectively by Martin Wattenberg, Fernanda Viegas, Ian Johnson
Methods of Classification on High-Dimensional Data
Clustering with High-Dimensional Data
Recitation: Demonstration of Data Visualization, Clustering, and Classification
- How to tune hyperparameters of tSNE by Nikolay Oskolkov
- t-SNE: The effect of various perplexity values on the shape
Module 3: Network Analysis
Graph Basics
- Graph Theory: 01. Seven Bridges of Konigsberg by Sarada Herke
- NetworkX Tutorial by Jacob Bank, Evan Rosen
Graph Centrality Measures
- Network Centrality by Systems Innovation
- Degree Centrality by John McCulloch
- Using Centrality Analysis to Fight Crime by Tom Sawyer Software
Spectral Clustering
- The Modern Algorithmic Toolbox Lectures #11: Spectral Graph Theory, Tim Roughgarden & Gregory Valiant
Graphical Models
Jupyter Notebook: From Data to Networks, Using Python Networkx
- Graphs and Graph Algorithms (networkx tutorial)
- Stylish adjacency matrix graphs by Jonathan Chang
Module 4: Time Series
Introduction to Time Series: Trend, Seasonality, Stationarity, Autocovariance
Time Series: Statistical Models
- Time Series: Autoregressive models AR, MA, ARMA, ARIMA by Mingda Zhang
- Let’s Forecast Your Time Series using Classical Approaches by Ajay Tiwari
Introduction to Time Series Analysis 3
- Forecasting: Principles and Practice, Rob J Hyndman and George Athanasopoulos
Module 5: Environmental Data and Gaussian Processes
Environmental Data and Gaussian Processes
Spatial Prediction
- The Kernel Cookbook: Advice on Covariance functions by David Duvenaud