Fundamentals Of Probability Distribution - rFronteddu/general_wiki GitHub Wiki

Fundamentals Of Probability Distribution

In probability theory and statistics, a probability distribution is a mathematical function that gives the probability of each possible outcome of an experiment. Put simply, a distribution describes the possible values that a variable can take and how frequently they occur.

  • We call Y the actual outcome of an event.
  • We call y one of the possible outcomes.
  • We describe distributions in terms of two characteristics:
    • μ (mu/mean): average value
    • σ² (sigma squared/variance): how spread out the data is.
  • A third, derived quantity is σ (sigma/standard deviation), the square root of the variance.

Note: One drawback of the variance is that it is measured in squared units, so its value usually has no direct interpretation. σ is often more informative because it is expressed in the same unit as the mean.
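As a minimal sketch of these quantities, assume a hypothetical discrete distribution given as a mapping from each possible outcome y to its probability (the numbers are made up for illustration). The mean, variance, and standard deviation can then be computed directly:

```python
import math

# Hypothetical distribution: possible outcomes y mapped to their probabilities.
dist = {1: 0.2, 2: 0.5, 3: 0.3}

def mean(dist):
    """mu: each outcome weighted by its probability."""
    return sum(y * p for y, p in dist.items())

def variance(dist):
    """sigma^2: probability-weighted squared distance from the mean."""
    mu = mean(dist)
    return sum((y - mu) ** 2 * p for y, p in dist.items())

mu = mean(dist)
var = variance(dist)
sigma = math.sqrt(var)  # same unit as the mean, unlike the variance

print(f"mean={mu:.2f} variance={var:.2f} std={sigma:.2f}")
```

If the outcomes were measured in metres, the variance would be in square metres, while σ would again be in metres.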

  • We often say that “any value between μ-σ and μ+σ falls within one σ of μ”.
  • The more concentrated the middle of the distribution, the more data falls within that interval (one standard deviation on either side of the mean).
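The effect of a concentrated middle can be checked numerically. The sketch below is only an illustration under assumed data: it compares simulated bell-shaped data with simulated flat (uniform) data, and the helper name `share_within_one_sigma` is hypothetical.

```python
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable

def share_within_one_sigma(data):
    """Fraction of values lying between mu - sigma and mu + sigma."""
    mu = statistics.fmean(data)
    sigma = statistics.pstdev(data)
    inside = [x for x in data if mu - sigma <= x <= mu + sigma]
    return len(inside) / len(data)

# Illustrative data sets: one peaked around its mean, one spread out evenly.
peaked = [random.gauss(0, 1) for _ in range(10_000)]
flat = [random.uniform(-1, 1) for _ in range(10_000)]

share_peaked = share_within_one_sigma(peaked)
share_flat = share_within_one_sigma(flat)
print(f"peaked: {share_peaked:.3f}  flat: {share_flat:.3f}")
```

With these assumptions, the peaked data set should put a larger share of its values inside the one-σ interval than the flat one.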

Note: For any distribution, the following relationship holds between the mean and the variance:

  • $\sigma^2 = E[(Y-\mu)^2] = E[Y^2] - \mu^2$
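For readers who want the missing step, the identity follows from expanding the square and using the linearity of expectation together with E[Y] = μ. A short derivation:

```latex
\begin{aligned}
\sigma^2 &= E\left[(Y-\mu)^2\right] \\
         &= E\left[Y^2 - 2\mu Y + \mu^2\right] \\
         &= E[Y^2] - 2\mu\,E[Y] + \mu^2 \\
         &= E[Y^2] - 2\mu^2 + \mu^2 \\
         &= E[Y^2] - \mu^2
\end{aligned}
```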

When we deal with certain known distributions, we can get more precise values.

Probability function

The probability function, written P(Y=y) = p(y), gives the probability that the outcome Y takes one specific value y.
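As an illustration, here is a minimal sketch of a probability function for an assumed experiment, the roll of a fair six-sided die. The die is not part of the original text; it is used only because its probabilities are easy to verify.

```python
from fractions import Fraction

def p(y):
    """Probability function of a fair six-sided die: P(Y = y)."""
    return Fraction(1, 6) if y in range(1, 7) else Fraction(0)

outcomes = range(1, 7)
total = sum(p(y) for y in outcomes)  # probabilities over all outcomes

print(p(3))   # probability of rolling a 3
print(p(7))   # an impossible outcome has probability 0
print(total)  # all probabilities together add up to 1
```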

Population vs Sample Data

  • Population data is all the data.
  • Sample data is a part of it. For example, the headcount of a department is:
    • population data for the department
    • sample data for the company

When using sample data, we use:

  • $\overline{x}$ to indicate the mean
  • $s^2$ to indicate the variance
  • s to indicate the standard deviation
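The distinction matters in practice because the two sets of statistics are computed slightly differently. A minimal sketch using Python's standard `statistics` module, with made-up numbers standing in for a small population:

```python
import statistics

# Hypothetical population: ages of everyone in a small company.
population = [23, 25, 31, 35, 38, 41, 44, 52]
# Hypothetical sample: every other person from that population.
sample = population[::2]

# Population statistics (mu, sigma^2, sigma) use every data point.
mu = statistics.mean(population)
sigma2 = statistics.pvariance(population)
sigma = statistics.pstdev(population)

# Sample statistics (sample mean, s^2, s) estimate them from a subset.
# statistics.variance and statistics.stdev divide by n - 1 rather than n.
sample_mean = statistics.mean(sample)
s2 = statistics.variance(sample)
s = statistics.stdev(sample)

print(mu, sigma2, sigma)
print(sample_mean, s2, s)
```

The sample values will generally differ from the population values; they are estimates, not the true parameters.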

Types of probability distributions

Distributions can be

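  • Discrete: if the outcomes can be listed one by one (a finite or countable set, such as the faces of a die)
  • Continuous: if the outcomes can be any value within an interval (infinitely many, such as a height or a duration)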

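We describe distributions using the following notation:

  • X ~ N(μ, $\sigma^2$)

This reads: "Variable X follows a distribution of type N with mean μ and variance $\sigma^2$", where the letter indicates the type of distribution (N denotes the Normal distribution).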
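To make the notation concrete, the sketch below draws values from X ~ N(10, 4), reading N as the Normal distribution. The parameter values are arbitrary assumptions. Note that the notation carries the variance, while `random.gauss` expects the standard deviation.

```python
import random
import statistics

random.seed(42)  # fixed seed so the run is repeatable

mu, sigma2 = 10.0, 4.0   # X ~ N(10, 4): mean 10, variance 4
sigma = sigma2 ** 0.5    # random.gauss takes sigma, not sigma squared

draws = [random.gauss(mu, sigma) for _ in range(50_000)]

est_mean = statistics.fmean(draws)
est_var = statistics.pvariance(draws)
print(f"estimated mean={est_mean:.2f} estimated variance={est_var:.2f}")
```

With enough draws, the estimated mean and variance should land close to the two parameters written inside N(μ, σ²).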