2_what_is_statistics - lotusflyer/hack_2018 GitHub Wiki

What is Statistics?

Statistics is a branch of mathematics dealing with the collection, organization, analysis, interpretation and presentation of data. In applying statistics to, for example, a scientific, industrial, or social problem, it is conventional to begin with a statistical population or a statistical model process to be studied. Populations can be diverse topics such as "all people living in a country" or "every atom composing a crystal". Statistics deals with all aspects of data including the planning of data collection in terms of the design of surveys and experiments.

Wikipedia

Some Key Points

  • Much of statistics works with probabilities and stochastic data (data affected by randomness)
  • The word "statistics" refers both to the field of statistics and the numerical quantities it calculates
  • Stochastic data is "randomly determined; having a random probability distribution or pattern that may be analyzed statistically but may not be predicted precisely."
  • Hence mathematical probability theory provides a foundation for much of statistics
  • Statistics often works with "incomplete data", that is, data that has been sampled from a larger population
  • There are two main branches of statistics: descriptive statistics and inferential statistics
  • Descriptive statistics is often concerned with summarizing data
  • Examples of descriptive statistics include: mean, median, standard deviation, maximum, minimum, range, scatter plots, correlations
  • Inferential statistics draws conclusions from data that is subject to random variation
  • Examples of inferential statistics are: opinion polling, t-test, ANOVA, linear regression
  • A robust and validated statistical model can be used to make justified predictions
  • Scientific theories can be tested using inferential statistics
  • In my experience, the lack of absolute certainty is challenging - it makes people uncomfortable
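
The difference between the two branches can be sketched in a few lines of Python. This is only an illustration using the standard library: the population (10,000 simulated values), the sample size, and the helper names `describe` and `mean_confidence_interval` are assumptions made for the example, and the interval uses a simple normal approximation rather than any particular method from the points above.

```python
import math
import random
import statistics

# Hypothetical population: 10,000 simulated values (mean 170, sd 10).
# A fixed seed keeps the example repeatable.
rng = random.Random(42)
population = [rng.gauss(170.0, 10.0) for _ in range(10_000)]

# "Incomplete data": we only observe a random sample of 100 values.
sample = rng.sample(population, 100)


def describe(data):
    """Descriptive statistics: summarize the data we actually have."""
    return {
        "mean": statistics.mean(data),
        "median": statistics.median(data),
        "stdev": statistics.stdev(data),  # sample standard deviation
        "min": min(data),
        "max": max(data),
        "range": max(data) - min(data),
    }


def mean_confidence_interval(data, z=1.96):
    """Inferential statistics: approximate 95% confidence interval for
    the population mean, using a normal approximation (z = 1.96)."""
    m = statistics.mean(data)
    standard_error = statistics.stdev(data) / math.sqrt(len(data))
    return (m - z * standard_error, m + z * standard_error)


summary = describe(sample)
low, high = mean_confidence_interval(sample)

print(summary)
print(f"approx. 95% CI for the population mean: ({low:.1f}, {high:.1f})")
```

The summary describes only the 100 sampled values. The interval is a statement about the whole population, and it comes with uncertainty: a different random sample would give a different interval, and some intervals built this way will miss the true mean.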