Day 1 (9/13/2019): Introduction to Data Science

Welcome to the Data-Science wiki!

Def: Data science is the application of computational and statistical techniques to address or gain insights into problems in the real world.

Statistics: The science of collecting, classifying, summarizing, organizing, analyzing, and interpreting data.

Linear Algebra: The mathematics of matrices and vector spaces.

AI: The study of computer algorithms that simulate intelligent behavior in order to perform activities normally thought to require intelligence.

Machine Learning: The study of computer algorithms that improve automatically through experience.

Database: The science and technology of collecting, storing, and managing data so users can retrieve, add, update, or remove such data.

Data science is NOT machine learning. Machine learning involves computation and statistics, but it is not as focused on answering real-world questions; the focus is mostly on fancy algorithms.

Data science is NOT ML competitions. Data science is not Kaggle; the goal is not to optimize a model for a given data set.

Data science is NOT statistics. Statistics evolved more along the mathematical and theoretical frontier.

Data science is NOT big data. You don't always need a massive amount of data to answer the questions.

Data science in the real world

Interview Questions

  1. How many minutes' worth of videos does the average publisher have?
  2. How many publishers have at least one user who watched their videos?
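
One possible way to approach both questions is sketched below in plain Python. The notes do not give a schema, so everything here is an assumption made for illustration: the `videos` and `views` records, their field names (`video_id`, `publisher_id`, `minutes`, `user_id`), and the helper function names are all hypothetical. Question 1 is read as "total minutes per publisher, averaged over publishers that have at least one video".

```python
from collections import defaultdict

# Hypothetical toy data; the record layout and field names are assumptions.
videos = [
    {"video_id": 1, "publisher_id": "A", "minutes": 10.0},
    {"video_id": 2, "publisher_id": "A", "minutes": 20.0},
    {"video_id": 3, "publisher_id": "B", "minutes": 6.0},
    {"video_id": 4, "publisher_id": "C", "minutes": 12.0},
]
views = [
    {"user_id": "u1", "video_id": 1},
    {"user_id": "u2", "video_id": 1},
    {"user_id": "u1", "video_id": 3},
]


def average_minutes_per_publisher(videos):
    """Sum minutes per publisher, then average those totals across publishers."""
    totals = defaultdict(float)
    for video in videos:
        totals[video["publisher_id"]] += video["minutes"]
    if not totals:
        return 0.0
    return sum(totals.values()) / len(totals)


def publishers_with_viewers(videos, views):
    """Count distinct publishers whose videos appear in at least one view."""
    publisher_of = {v["video_id"]: v["publisher_id"] for v in videos}
    watched = {
        publisher_of[view["video_id"]]
        for view in views
        if view["video_id"] in publisher_of  # skip views of unknown videos
    }
    return len(watched)


print(average_minutes_per_publisher(videos))   # 16.0 for the toy data
print(publishers_with_viewers(videos, views))  # 2 for the toy data
```

In an interview these would more likely be answered in SQL with a `GROUP BY` for question 1 and a join plus `COUNT(DISTINCT ...)` for question 2; the Python version only mirrors that logic. Note that publishers with no videos never appear in `videos`, so they are excluded from the average under this reading.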