one hot encoding - taoualiw/My-Knowledge-Base GitHub Wiki

One Hot Encoding

A one hot encoding is a representation of categorical variables as binary vectors.

This first requires that the categorical values be mapped to integer values.

Then, each integer value is represented as a binary vector that is all zero values except the index of the integer, which is marked with a 1.

exp : if we have three classes {a,b,c} -->{1,2,3}-->{[1,0,0],[0,1,0],[0,0,1]}

Using just the numbers {1,2,3} might work but there may be problems when there is no ordinal relationship between categories and allowing the representation to lean on any such relationship might be damaging to learning to solve the problem. An example might be the labels ‘dog’ and ‘cat’. When a one hot encoding is used for the output variable, it may offer a more nuanced set of predictions than a single label.

References

https://machinelearningmastery.com/how-to-one-hot-encode-sequence-data-in-python/

⚠️ **GitHub.com Fallback** ⚠️