Home - Ajarlin/Data-Science GitHub Wiki

Welcome to the Data-Science wiki!

Course Code: 01:198:439

This course covers the topics needed to solve problems involving data: preparation (collection and integration), characterization and presentation (information visualization), analysis (machine learning and data mining), and products (applications).

Topics:

  • Data visualization

  • Data wrangling and pre-processing

  • MapReduce and the new software stack

  • Data mining: finding similar items, mining data streams, frequent itemsets, link analysis, mining graph data

  • Machine learning: k-nearest neighbors, decision trees, naive Bayes, regression, ensemble methods, support vector machines, k-means, spectral clustering, hierarchical clustering, dimensionality reduction, evaluation techniques

  • Applications: recommendation systems, advertising on the Web