Fundamentals Of Probability Distribution - rFronteddu/general_wiki GitHub Wiki

Fundamentals Of Probability Distribution

A probability distribution is a function that gives the possible values of a variable and how often each of them occurs. Put simply, it provides the probability of occurrence of each possible outcome of an experiment.

We call Y the actual outcome of an event.
We call y one of the possible outcomes.
We describe distributions in terms of the following characteristics:
- μ (mu, mean): the average value.
- σ² (sigma squared, variance): how spread out the data is.
- σ (sigma, standard deviation): the square root of the variance.

Note: One drawback of the variance is that it is measured in squared units, so its value usually has no direct interpretation. σ is often more informative because it is expressed in the same unit as the mean.
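To make these three quantities concrete, here is a minimal sketch that computes them for a fair six-sided die. The die and the name `pmf` are assumptions chosen for illustration, not part of the original notes.

```python
from fractions import Fraction
from math import sqrt

# Hypothetical example: a fair six-sided die, each face with probability 1/6.
pmf = {y: Fraction(1, 6) for y in range(1, 7)}

# Mean: probability-weighted average of the outcomes.
mu = sum(y * p for y, p in pmf.items())

# Variance: probability-weighted average squared distance from the mean.
var = sum((y - mu) ** 2 * p for y, p in pmf.items())

# Standard deviation: square root of the variance, in the same unit as the mean.
sigma = sqrt(var)

print(mu, var, round(sigma, 3))  # 7/2 35/12 1.708
```

The variance 35/12 is in "squared pips", while σ ≈ 1.708 is in pips, the same unit as the mean 3.5.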

We often say that “any value between μ-σ and μ+σ falls within one σ of μ”.
The more concentrated the middle of the distribution, the more data falls within that interval (one standard deviation either side of the mean).
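As a rough illustration, the sketch below compares two hand-picked discrete distributions over the same outcomes and measures how much probability lies within one standard deviation of the mean. The distributions `flat` and `peaked` and the helper `mass_within_one_sigma` are hypothetical. This shows the tendency for these two cases only and is not a general proof.

```python
from math import sqrt

def mass_within_one_sigma(pmf):
    """Return the probability that an outcome lies in [mu - sigma, mu + sigma]."""
    mu = sum(y * p for y, p in pmf.items())
    sigma = sqrt(sum((y - mu) ** 2 * p for y, p in pmf.items()))
    return sum(p for y, p in pmf.items() if mu - sigma <= y <= mu + sigma)

# Two hypothetical distributions over the outcomes 1..5, both with mean 3.
flat = {1: 0.2, 2: 0.2, 3: 0.2, 4: 0.2, 5: 0.2}
peaked = {1: 0.1, 2: 0.2, 3: 0.4, 4: 0.2, 5: 0.1}

flat_mass = mass_within_one_sigma(flat)      # about 0.6
peaked_mass = mass_within_one_sigma(peaked)  # about 0.8
print(flat_mass, peaked_mass)
```

Here the distribution with more mass in the middle also has more mass within one σ of μ.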

Note: A fixed relationship exists between the mean and the variance:

$\sigma^2 = E[(Y-\mu)^2] = E(Y^2) - \mu^2$
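As a sketch of why the two forms of the variance agree, expand the square and use the linearity of expectation, with $\mu = E(Y)$:

$E[(Y-\mu)^2] = E(Y^2 - 2\mu Y + \mu^2) = E(Y^2) - 2\mu E(Y) + \mu^2 = E(Y^2) - 2\mu^2 + \mu^2 = E(Y^2) - \mu^2$

For a fair six-sided die (an assumed example), $E(Y) = 7/2$ and $E(Y^2) = 91/6$, so

$\sigma^2 = 91/6 - (7/2)^2 = 182/12 - 147/12 = 35/12$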

When we deal with certain known distributions, we can get more precise values.

Probability function

The probability function, written P(Y = y) = p(y), gives the probability of a distinct outcome y.
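A minimal sketch of a probability function, again assuming a fair six-sided die as the experiment. The function name `p` mirrors the notation p(y); the die itself is a hypothetical example.

```python
from fractions import Fraction

def p(y):
    """Probability function p(y) = P(Y = y) for a fair six-sided die."""
    # Each of the six faces is equally likely; any other value is impossible.
    return Fraction(1, 6) if y in range(1, 7) else Fraction(0)

print(p(3))                            # 1/6
print(p(7))                            # 0
print(sum(p(y) for y in range(1, 7)))  # 1
```

The probabilities of all possible outcomes sum to 1.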

Population vs Sample Data

Population data is all the data; sample data is a part of it. For example, the head count of a department is:
- the population data of the department
- sample data of the company

When using sample data, we use:

- $\overline{x}$ to indicate the mean
- $s^2$ to indicate the variance
- $s$ to indicate the standard deviation
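The sketch below contrasts population parameters with sample statistics using Python's standard `statistics` module. The ages are made-up data: the full list stands for a company (population) and the shorter list for one department (sample). Note that `statistics.variance` and `statistics.stdev` are the sample versions (dividing by n - 1), while `pvariance` and `pstdev` are the population versions.

```python
import statistics

# Hypothetical population: ages of every employee in a small company.
population = [23, 25, 31, 35, 38, 41, 44, 52]
# Hypothetical sample: ages of the employees in one department.
sample = [25, 35, 41, 52]

# Population parameters: mu, sigma squared, sigma.
mu = statistics.mean(population)
sigma2 = statistics.pvariance(population)
sigma = statistics.pstdev(population)

# Sample statistics: sample mean, s squared, s.
sample_mean = statistics.mean(sample)
s2 = statistics.variance(sample)
s = statistics.stdev(sample)

print(mu, sigma2, sigma)
print(sample_mean, s2, s)
```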

Types of probability distributions

Distributions can be:

- Discrete: if they have a finite (or countably infinite) number of outcomes
- Continuous: if the outcomes can take any value in a range, so there are uncountably many of them

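We describe distributions using notation such as:

$X \sim N(\mu, \sigma^2)$

This reads: "Variable X follows a distribution of type N with mean μ and variance $\sigma^2$." The letter names the type of distribution (N conventionally denotes the Normal distribution) and the parentheses hold its characteristics.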