Fundamentals Of Probability Distribution - rFronteddu/general_wiki GitHub Wiki
Fundamentals Of Probability Distribution
A distribution is a function that describes the possible values a variable can take and how frequently they occur. In probability theory and statistics, a probability distribution is a mathematical function that, stated in simple terms, gives the probabilities of the different possible outcomes of an experiment.
- We call Y the actual outcome of an event.
- We call y one of the possible outcomes.
- We describe distributions in terms of two characteristics:
- μ (mu/mean): average value
- σ² (sigma squared/variance): how spread out the data is.
- σ (sigma/standard deviation): the square root of the variance.

Note: One drawback of the variance is that it is measured in squared units, so its value usually has no direct interpretation. σ is often more informative because it is expressed in the same unit as the mean.
- We often say "any value between μ-σ and μ+σ falls within one σ of μ".
- The more concentrated the distribution is around its middle, the more data falls within that interval (one standard deviation).
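
As a minimal sketch of these quantities, the snippet below computes μ, σ² and σ for a small made-up data set and then counts how much of it lies within one standard deviation of the mean. The data values and variable names are assumptions chosen for illustration, and the data is treated as a full population.

```python
import math

# Hypothetical data set, treated here as a complete population.
values = [2, 4, 4, 4, 5, 5, 7, 9]

mu = sum(values) / len(values)                          # mean
var = sum((v - mu) ** 2 for v in values) / len(values)  # variance (squared units)
sigma = math.sqrt(var)                                  # standard deviation (same unit as the mean)

# Share of the data that falls between mu - sigma and mu + sigma.
within = [v for v in values if mu - sigma <= v <= mu + sigma]
share = len(within) / len(values)

print(mu, var, sigma, share)  # 5.0 4.0 2.0 0.75
```

Here the interval is [3, 7], and six of the eight values fall inside it.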
Note: A constant relationship exists between the mean and the variance:
- $\sigma^2 = E[(Y-\mu)^2] = E[Y^2] - \mu^2$
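
The relationship between mean and variance can be checked numerically. The sketch below assumes a fair six-sided die as the example variable (an illustration, not something the notes specify) and uses exact fractions, so both forms of the variance can be compared without rounding error.

```python
from fractions import Fraction

# Assumed example: a fair six-sided die, each outcome with probability 1/6.
outcomes = range(1, 7)
p = Fraction(1, 6)

mu = sum(y * p for y in outcomes)                      # E[Y]
var_def = sum((y - mu) ** 2 * p for y in outcomes)     # E[(Y - mu)^2]
var_short = sum(y * y * p for y in outcomes) - mu**2   # E[Y^2] - mu^2

print(mu, var_def, var_short)  # 7/2 35/12 35/12
```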
When we deal with certain known distributions, we can get more precise values.
Probability function
The probability function, written P(Y=y) = p(y), is a function that gives the probability of a distinct outcome y.
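
As an illustration, the sketch below builds p(y) for the sum of two fair dice by counting equally likely outcomes. The choice of experiment and the helper name `p` are assumptions made for this example.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Count how many of the 36 equally likely rolls produce each sum.
counts = Counter(a + b for a, b in product(range(1, 7), repeat=2))
total = sum(counts.values())

def p(y):
    """Probability function P(Y = y) for the sum of two fair dice."""
    return Fraction(counts.get(y, 0), total)

print(p(7))                            # 1/6
print(p(1))                            # 0, because a sum of 1 is impossible
print(sum(p(y) for y in range(2, 13))) # 1, the probabilities of all outcomes add up to 1
```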
Population vs Sample Data
- The population data is all of the data.
- The sample data is a part of it. For example, the headcount of a department is:
  - the population data for that department
  - sample data for the company
When using sample data, we use:
- $\overline{V}$ to indicate the mean
- $s^2$ to indicate the variance
- $s$ to indicate the standard deviation
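
The sketch below shows how the sample and population versions can be computed with Python's standard `statistics` module. The numbers are made up: `company` stands in for the whole population and `department` for a sample drawn from it. Note that `statistics.variance` and `statistics.stdev` divide by n-1 (the usual convention for sample data), while `statistics.pvariance` divides by n.

```python
import statistics

# Hypothetical data: the whole company is the population,
# one department is a sample of it.
company = [4, 8, 6, 2, 10, 12]
department = company[:4]

# Population statistics (mu and sigma^2).
pop_mean = statistics.mean(company)
pop_var = statistics.pvariance(company)

# Sample statistics (mean, s^2 and s).
sample_mean = statistics.mean(department)
sample_var = statistics.variance(department)   # divides by n - 1
sample_std = statistics.stdev(department)      # square root of sample_var

print(pop_mean, pop_var)
print(sample_mean, sample_var, sample_std)
```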
Types of probability distributions
Distributions can be
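- Discrete: if their outcomes can be listed one by one (finitely many, or countably many such as 0, 1, 2, ...)
- Continuous: if their outcomes can take any value in an interval, so there are infinitely many of them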
We describe distributions using:
- $X \sim N(\mu, \sigma^2)$

"Variable X follows a distribution of type N with mean μ and variance $\sigma^2$."
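
A short sketch of this notation in code, assuming N denotes the normal distribution and using made-up values for μ and σ². Python's `statistics.NormalDist` is parameterised by σ rather than σ², so the variance has to be converted first. The example also computes how much of the distribution lies within one σ of μ.

```python
from statistics import NormalDist

# Hypothetical parameters for X ~ N(mu, sigma^2).
mu = 170.0
variance = 100.0

# NormalDist expects the standard deviation, not the variance.
X = NormalDist(mu=mu, sigma=variance ** 0.5)

# Probability that X falls between mu - sigma and mu + sigma.
share = X.cdf(mu + X.stdev) - X.cdf(mu - X.stdev)

print(X.mean, X.variance)
print(round(share, 4))  # about 0.6827 for any normal distribution
```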