EM (Expectation Maximization) - SoojungHong/TextMining GitHub Wiki
EM algorithm
In statistics, the EM algorithm is an iterative method that maximizes the likelihood of the observed data while estimating the parameters of a statistical model that contains unobserved (latent) variables.
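To make "likelihood of the observed data" concrete, here is one common way to write the quantity EM tries to maximize. The symbols are notation assumed for this sketch, not taken from this page: x_i are the observed data points, z is the unobserved variable (for clustering, the cluster a point belongs to), and θ are the model parameters.

```latex
\log L(\theta) \;=\; \sum_{i=1}^{n} \log p(x_i \mid \theta)
             \;=\; \sum_{i=1}^{n} \log \sum_{z} p(x_i, z \mid \theta)
```

The sum over z inside the logarithm is what makes direct maximization awkward, and it is the reason EM alternates between estimating z and updating θ.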
Why it is similar to clustering
By optimizing the likelihood, EM fits a model that assigns each data point a probability of belonging to each class. Sounds like clustering to me!
In fact, EM is one of the well-known clustering algorithms.
Well-known clustering algorithms
k-Means
Hierarchical Cluster Analysis (HCA)
Expectation Maximization
EM algorithm steps
EM begins by making a guess at the model parameters.
Then it iterates as follows:
- E-step: Using the current model parameters, calculate for each data point the probability that it belongs to each cluster.
- M-step: Update the model parameters based on the cluster assignments from the E-step.
- Repeat until the model parameters and cluster assignments stabilize (a.k.a. convergence).
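The steps above can be sketched in code. This is a minimal illustration, not a reference implementation: it assumes the model is a one-dimensional mixture of Gaussians, and the names `gaussian_pdf` and `em_gmm_1d`, the quantile-based initial guess, and the log-likelihood stopping rule are all choices made for this example rather than anything prescribed by this page.

```python
import numpy as np


def gaussian_pdf(x, mean, var):
    """Density of a normal distribution with the given mean and variance."""
    return np.exp(-(x - mean) ** 2 / (2 * var)) / np.sqrt(2 * np.pi * var)


def em_gmm_1d(x, k=2, max_iter=200, tol=1e-6):
    """Fit a k-component 1-D Gaussian mixture with EM (illustrative sketch)."""
    x = np.asarray(x, dtype=float)
    n = x.size

    # Initial guess (an assumption of this sketch): means at evenly spaced
    # quantiles of the data, a shared variance, and equal mixing weights.
    means = np.quantile(x, (np.arange(k) + 0.5) / k)
    variances = np.full(k, x.var())
    weights = np.full(k, 1.0 / k)

    prev_ll = -np.inf
    for _ in range(max_iter):
        # E-step: probability that each point belongs to each cluster,
        # given the current parameters. Shape of resp is (n, k).
        dens = weights * gaussian_pdf(x[:, None], means, variances)
        total = dens.sum(axis=1, keepdims=True)
        resp = dens / total

        # Log-likelihood of the data under the parameters used in this E-step.
        ll = np.log(total).sum()

        # M-step: re-estimate the parameters from the soft assignments.
        nk = resp.sum(axis=0)
        weights = nk / n
        means = (resp * x[:, None]).sum(axis=0) / nk
        # Small constant keeps a variance from collapsing to exactly zero.
        variances = (resp * (x[:, None] - means) ** 2).sum(axis=0) / nk + 1e-9

        # Stop once the log-likelihood has stabilized (convergence).
        if abs(ll - prev_ll) < tol:
            break
        prev_ll = ll

    return weights, means, variances, resp


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    data = np.concatenate([rng.normal(-5, 1, 200), rng.normal(5, 1, 200)])
    weights, means, variances, resp = em_gmm_1d(data, k=2)
    print("weights:", weights)
    print("means:", means)
    # Hard cluster labels, if needed, are the most probable cluster per point.
    print("first labels:", resp.argmax(axis=1)[:10])
```

Each row of `resp` holds the soft assignment of one point, which is what separates EM from k-Means: k-Means gives each point exactly one cluster, whereas EM keeps a probability for every cluster and only becomes a hard labeling when you take the `argmax`.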