Non-negative matrix factorisation

NMF

How it works

NMF is a way to reduce the dimensionality of a matrix. This is achieved by factorising the matrix A into two smaller matrices, named W and H, whose product approximates A.

$$A \approx WH$$

If the matrix A is of size $n \times m$, then the new matrices will be of size $W = n \times p$ and $H = p \times m$. Here p can be chosen freely depending on how accurate the approximation needs to be (the rank of the matrix is traditionally a good choice). Conceptually, W can be seen as the most important building blocks that make up A, and H as the combination of those blocks that best approximates A. How good the approximation is can be measured with the Frobenius norm; the best achievable Frobenius error for a rank-p approximation (without the non-negativity constraint) is given by the truncated Singular Value Decomposition, which makes a useful baseline.
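As a purely illustrative example of the sizes involved: if A is a $1000 \times 500$ matrix and we pick $p = 10$, then W is $1000 \times 10$ and H is $10 \times 500$, so the factorisation stores $15{,}000$ numbers instead of $500{,}000$.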

How do we find W and H?

This is essentially an optimization problem, where we look for the W and H that best approximate A. In the case of a rank-$p$ factorisation it can be described mathematically like this:

$$\min_{W,\, H} \; \|A - WH\|_{F}^{2} \quad \text{such that } W \geq 0 \text{ and } H \geq 0$$

This means minimising the difference between the original matrix and the product of the two matrices that approximate it; the subscript $F$ stands for the Frobenius norm, which is the tool used to tell how alike they are. The actual optimization is often done using the approach described below. "Almost all NMF algorithms use a two-block coordinate descent scheme (exact or inexact), that is, they optimize alternatively over one of the two factors, W or H, while keeping the other fixed. The reason is that the subproblem in one factor is convex. More precisely, it is a nonnegative least squares problem (NNLS). Many algorithms exist to solve the NNLS problem; and NMF algorithms based on two-block coordinate descent differ by which NNLS algorithm is used."
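For reference, the Frobenius norm of a matrix $M$ is simply the square root of the sum of its squared entries:

$$\|M\|_{F} = \sqrt{\sum_{i,j} M_{ij}^{2}}$$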

This is not the only way to do it, just a common one; there is a lot of ongoing research both into more efficient algorithms and into how to choose the initial values of W and H.
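As an illustration, here is a minimal NumPy sketch of the classic multiplicative update rules from the Lee & Seung paper listed in the sources, which is one such alternating scheme. The function name, the fixed iteration count, and the small epsilon added to avoid division by zero are our own choices for this sketch, not part of any standard API.

```python
import numpy as np

def nmf_multiplicative(A, p, iterations=200, seed=0, eps=1e-10):
    """Approximate a non-negative n x m matrix A as W @ H,
    with W of size n x p and H of size p x m, using the
    Lee & Seung multiplicative update rules."""
    rng = np.random.default_rng(seed)
    n, m = A.shape
    W = rng.random((n, p))   # random non-negative initialisation
    H = rng.random((p, m))
    for _ in range(iterations):
        # Each update keeps the factors non-negative
        H *= (W.T @ A) / (W.T @ W @ H + eps)
        W *= (A @ H.T) / (W @ H @ H.T + eps)
    return W, H

# Tiny usage example on a random non-negative matrix
A = np.random.default_rng(1).random((6, 4))
W, H = nmf_multiplicative(A, p=2)
print(np.linalg.norm(A - W @ H))  # Frobenius norm of the approximation error
```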

What is it used for?

It is generally used for things like text mining and image processing.
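In practice you rarely need to implement NMF by hand; for example, scikit-learn provides an implementation. A minimal usage sketch (the random matrix and the parameter values below are just placeholders for real data):

```python
import numpy as np
from sklearn.decomposition import NMF

# A small non-negative "document-term"-style matrix (rows: samples, columns: features)
X = np.random.default_rng(0).random((10, 6))

model = NMF(n_components=3, init='random', random_state=0, max_iter=500)
W = model.fit_transform(X)        # n x p: each row is a sample expressed in the p components
H = model.components_             # p x m: the p "building block" components
print(model.reconstruction_err_)  # Frobenius-norm error of the approximation
```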

Things to keep in mind

This technique requires data that is non-negative, and it is relatively computation-heavy, since there is no single optimal answer that can be computed in closed form.

Sources

https://www.nature.com/articles/44565

https://blog.acolyer.org/2019/02/18/the-why-and-how-of-nonnegative-matrix-factorization/

https://www.geeksforgeeks.org/non-negative-matrix-factorization/

https://towardsdatascience.com/nmf-a-visual-explainer-and-python-implementation-7ecdd73491f8