Day 1 (9/13/2019): Introduction to Data Science - Ajarlin/Data-Science GitHub Wiki
Welcome to the Data-Science wiki!
Def: Data science is the application of computational and statistical techniques to address or gain insights into problems in the real world.
Statistics: The science of collecting, classifying, summarizing, organizing, analyzing, and interpreting data.
Linear Algebra: The mathematics of matrices and vector spaces.
AI: The study of computer algorithms that simulate intelligent behavior in order to perform activities normally thought to require intelligence.
Machine Learning: The study of computer algorithms that learn, improving automatically through experience.
Database: The science and technology of collecting, storing, and managing data so users can retrieve, add, update, or remove such data.
Data science is NOT machine learning, which involves computation and statistics but is not so much focused on answering real-world questions. Mostly the focus there is on fancy algorithms.
ML competitions: data science is not Kaggle; the goal is not to optimize a model for a given data set.
Statistics: evolved more along the mathematical and theoretical frontier.
Big data: you don't always need a massive amount of data to answer the questions.
Data science in the real world
Interview Questions
- How many minutes' worth of videos does the average publisher have?
- How many publishers have at least one user who watched their videos?
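The two questions above can be sketched in plain Python. This is only a minimal sketch: the notes do not give a schema, so the `videos` and `views` tables, their column order, the sample rows, and both function names are hypothetical. It also assumes "average publisher" means the mean of total minutes over publishers that have at least one video.

```python
from collections import defaultdict

# Hypothetical toy tables (made up for illustration; the notes give no schema).
# videos: (video_id, publisher_id, duration_minutes)
videos = [
    ("v1", "pubA", 10.0),
    ("v2", "pubA", 20.0),
    ("v3", "pubB", 6.0),
    ("v4", "pubC", 12.0),
]
# views: (user_id, video_id), one row per watch event
views = [
    ("u1", "v1"),
    ("u2", "v1"),
    ("u1", "v3"),
]


def average_minutes_per_publisher(videos):
    """Mean of total video minutes, taken over publishers with at least one video."""
    totals = defaultdict(float)
    for _video_id, publisher_id, minutes in videos:
        totals[publisher_id] += minutes
    return sum(totals.values()) / len(totals) if totals else 0.0


def publishers_with_viewers(videos, views):
    """Set of publishers whose videos appear in at least one view row."""
    owner = {video_id: publisher_id for video_id, publisher_id, _ in videos}
    return {owner[video_id] for _user_id, video_id in views if video_id in owner}


avg_minutes = average_minutes_per_publisher(videos)
watched_publishers = publishers_with_viewers(videos, views)
print(avg_minutes)
print(len(watched_publishers))
```

Note that the first answer depends on how "average publisher" is defined: publishers with no videos are excluded here, and including them would lower the mean.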