Unsupervised vs. supervised learning - sagr4019/ResearchProject GitHub Wiki
Unsupervised vs. supervised learning
General
Both supervised and unsupervised learning are branches of machine learning.
Supervised learning
Supervised learning is a machine learning method that uses labeled data, i.e. for every entry in our dataset there is an input and a known output value. The advantage of supervised learning is that the machine learns a model from our dataset during training, and this model can then predict the output value for new inputs. Typical tasks we can solve with supervised learning are classification and regression problems.
Regression
In regression, the properties that describe an entry in the dataset influence a numeric output value, e.g. the value of a share or the house prices in a certain area. Another example is to take information about a person and estimate their age (can walk -> older than x ...). Here we can use "linear regression".
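As a minimal sketch of the house price example, the following fits a straight line through a few labeled data points using ordinary least squares. The numbers, the choice of living area as the only input, and the function name `fit_line` are assumptions made up for illustration.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # slope = covariance(x, y) / variance(x)
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    slope = cov_xy / var_x
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Made-up labeled data: living area in square metres (input)
# and price in thousands (output).
areas = [50, 70, 90, 110]
prices = [150, 210, 270, 330]

slope, intercept = fit_line(areas, prices)

# Use the learned model to predict the price of an unseen 100 m^2 house.
predicted = slope * 100 + intercept
print(predicted)  # 300.0 for this perfectly linear toy data
```

Real data is never perfectly linear, so in practice the fitted line only approximates the outputs.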
Classification
In classification we want to sort our data into given categories. For example, we want to classify an e-mail as spam or not spam, based on some describing variables. For classification we can use logistic regression.
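The spam example could be sketched roughly as follows: a small logistic regression trained with plain batch gradient descent. The two describing variables (number of links, number of suspicious words), the toy data, and the learning rate and epoch count are all hypothetical choices for illustration.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic(features, labels, lr=0.1, epochs=2000):
    """Fit weights and bias by batch gradient descent on the log loss."""
    n = len(features)
    n_features = len(features[0])
    weights = [0.0] * n_features
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for x, y in zip(features, labels):
            p = sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)
            error = p - y  # derivative of the log loss w.r.t. the linear score
            for j in range(n_features):
                grad_w[j] += error * x[j]
            grad_b += error
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


def predict(weights, bias, x):
    """Return 1 (spam) if the predicted probability is at least 0.5, else 0."""
    p = sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)
    return 1 if p >= 0.5 else 0


# Hypothetical describing variables per e-mail:
# [number of links, number of suspicious words]
emails = [[0, 0], [1, 0], [0, 1], [5, 3], [4, 4], [6, 2]]
is_spam = [0, 0, 0, 1, 1, 1]  # the labels: 1 = spam, 0 = not spam

weights, bias = train_logistic(emails, is_spam)
print(predict(weights, bias, [7, 5]), predict(weights, bias, [0, 0]))
```

Despite its name, logistic regression is used here for classification: it outputs a probability, which is turned into a category by a threshold.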
Unsupervised learning
Unsupervised learning uses "unlabeled" data. The machine tries to find patterns in the input values.
Cluster analysis
Cluster analysis means grouping the data according to similarities.
As an example, take a basket with different fruits (apples, oranges, strawberries) as our dataset. You take one fruit and analyse it. Every fruit that is similar to this first fruit is put together with it. For every fruit that is not similar, we start a new basket, store the information about that fruit, and add similar fruits to its basket.
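The basket procedure can be sketched as a simple greedy grouping. This is only an illustration of the idea, not a standard library algorithm; the two measurements per fruit (diameter in cm and a made-up color score), the distance threshold, and the function name are assumptions. Note that the fruits carry no labels: the code never knows which one is an apple.

```python
import math


def sort_into_baskets(fruits, threshold):
    """Put each fruit into the first basket whose reference fruit is
    close enough; otherwise open a new basket for it."""
    baskets = []
    for fruit in fruits:
        for basket in baskets:
            if math.dist(fruit, basket["reference"]) <= threshold:
                basket["members"].append(fruit)
                break
        else:
            # No similar basket found: store this fruit as a new reference.
            baskets.append({"reference": fruit, "members": [fruit]})
    return baskets


# Hypothetical measurements: (diameter in cm, color score from 0 to 10).
fruits = [
    (7.5, 8.0),  # apple-like
    (7.5, 4.0),  # orange-like
    (3.0, 9.5),  # strawberry-like
    (8.0, 9.0),  # apple-like
    (2.5, 9.0),  # strawberry-like
    (8.0, 3.5),  # orange-like
]

baskets = sort_into_baskets(fruits, threshold=2.0)
for basket in baskets:
    print(basket["members"])
```

With this toy data the fruits end up in three baskets of two fruits each. The result depends on the threshold and on the order in which the fruits are picked.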
Compression
Another use case is the compression of data.
E.g. we have a dataset in which each entry has 50,000 attributes (way too many, and we don't even need every attribute for our training). So the machine tries to remove as many attributes as possible while losing as little as possible of the information needed to represent our dataset.
One example is "principal component analysis" (PCA), which combines the original attributes into a smaller number of new ones.
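To give a feeling for the idea, here is a minimal sketch of principal component analysis in the smallest possible setting: entries with two attributes are reduced to a single value each. The toy data and the function name are assumptions, and the closed-form angle used here only works for two attributes; real datasets need a general eigenvalue or SVD routine.

```python
import math


def pca_2d_to_1d(points):
    """Reduce 2-attribute points to one value each along the
    first principal component."""
    n = len(points)
    mean_x = sum(p[0] for p in points) / n
    mean_y = sum(p[1] for p in points) / n
    centered = [(x - mean_x, y - mean_y) for x, y in points]

    # Entries of the 2x2 sample covariance matrix [[a, b], [b, c]].
    a = sum(x * x for x, _ in centered) / (n - 1)
    b = sum(x * y for x, y in centered) / (n - 1)
    c = sum(y * y for _, y in centered) / (n - 1)

    # Direction of largest variance for a symmetric 2x2 matrix.
    theta = 0.5 * math.atan2(2 * b, a - c)
    direction = (math.cos(theta), math.sin(theta))

    # Project every centered point onto that direction.
    scores = [x * direction[0] + y * direction[1] for x, y in centered]

    # Share of the total variance that survives the compression.
    kept = sum(s * s for s in scores) / (n - 1)
    return direction, scores, kept / (a + c)


# Made-up data in which the second attribute is roughly twice the first.
points = [(1.0, 2.1), (2.0, 3.9), (3.0, 6.2), (4.0, 7.8)]

direction, scores, kept_ratio = pca_2d_to_1d(points)
print(direction, kept_ratio)
```

Because the two attributes are strongly correlated, one value per entry keeps almost all of the variance in this toy dataset.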
Unsupervised vs. supervised learning
The following graphic shows real-life examples of supervised and unsupervised learning, grouped by the typical tasks of each learning type: classification and regression for supervised learning, and clustering (also known as cluster analysis) for unsupervised learning.
DEEP GENERATIVE MODELS FOR SYNTHETIC RETINAL IMAGE GENERATION - Scientific Figure on ResearchGate. Available from: https://www.researchgate.net/Examples-of-real-life-problems-in-the-context-of-supervised-and-unsupervised-learning_fig8_319093376 [accessed 21 May, 2018]
One final example with a use case each for unsupervised and supervised learning.
Let's take Netflix as our dataset.
- Grouping the movies by their properties (e.g. men in their forties watching movies with Scarlett Johansson) is a use case for unsupervised learning.
- Supervised learning would be well suited to sorting the movies into their genres.
References
http://oliviaklose.azurewebsites.net/machine-learning-2-supervised-versus-unsupervised-learning/