2_what_is_statistics - lotusflyer/hack_2018 GitHub Wiki
What is Statistics?
Statistics is a branch of mathematics dealing with the collection, organization, analysis, interpretation and presentation of data. In applying statistics to, for example, a scientific, industrial, or social problem, it is conventional to begin with a statistical population or a statistical model process to be studied. Populations can be diverse topics such as "all people living in a country" or "every atom composing a crystal". Statistics deals with all aspects of data including the planning of data collection in terms of the design of surveys and experiments.
Wikipedia
Some Key Points
- Much of statistics works with probabilities and stochastic data (data affected by randomness)
- The word "statistics" refers both to the field itself and to the numerical quantities it calculates
- Stochastic data is "randomly determined; having a random probability distribution or pattern that may be analyzed statistically but may not be predicted precisely."
- Hence mathematical probability theory provides a foundation for much of statistics
- Statistics often works with "incomplete data", that is, data that has been sampled from a larger population
- There are two main branches of statistics: descriptive statistics and inferential statistics
- Descriptive statistics is often concerned with summarizing data
- Examples of descriptive statistics include: mean, median, standard deviation, maximum, minimum, range, scatter plots, correlations
- Inferential statistics draws conclusions from data that is subject to random variation
- Examples of inferential statistics are: opinion polling, t-tests, ANOVA, linear regression
- A robust and validated statistical model can be used to make justified predictions
- Scientific theories can be tested using inferential statistics
- In my experience, the lack of absolute certainty is challenging: it makes people uncomfortable
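
To make the two branches concrete, here is a minimal Python sketch using only the standard library. The simulated height data, the seed, and the helper names `describe` and `welch_t` are assumptions made for illustration; none of them come from the notes above. `describe` summarizes a sample (descriptive statistics), while `welch_t` computes the test statistic behind one common form of the two-sample t-test (inferential statistics).

```python
import math
import random
import statistics


def describe(data):
    """Descriptive statistics: summarize the data we actually have."""
    return {
        "mean": statistics.mean(data),
        "median": statistics.median(data),
        "stdev": statistics.stdev(data),  # sample standard deviation (n - 1)
        "min": min(data),
        "max": max(data),
        "range": max(data) - min(data),
    }


def welch_t(sample_a, sample_b):
    """Welch's t statistic for the difference between two sample means."""
    mean_a = statistics.mean(sample_a)
    mean_b = statistics.mean(sample_b)
    var_a = statistics.variance(sample_a)  # sample variance (n - 1)
    var_b = statistics.variance(sample_b)
    standard_error = math.sqrt(var_a / len(sample_a) + var_b / len(sample_b))
    return (mean_a - mean_b) / standard_error


if __name__ == "__main__":
    # Hypothetical populations: simulated heights in cm for two groups.
    rng = random.Random(2018)  # fixed seed so the run is repeatable
    population_a = [rng.gauss(170, 10) for _ in range(10_000)]
    population_b = [rng.gauss(173, 10) for _ in range(10_000)]

    # "Incomplete data": we only observe a small sample of each population.
    sample_a = rng.sample(population_a, 50)
    sample_b = rng.sample(population_b, 50)

    print("Sample A summary:", describe(sample_a))
    print("Sample B summary:", describe(sample_b))

    # A large |t| suggests the population means differ, but because the
    # samples are random the conclusion is never absolutely certain.
    print("Welch t statistic:", welch_t(sample_a, sample_b))
```

Turning the t statistic into a p-value requires the t distribution, which the standard library does not provide; a statistics package would normally handle that step. Changing the seed changes the samples and therefore the result, which is the random variation that inferential statistics has to account for.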