# Knowledge distillation

## About

* Technique that transfers knowledge from a large "teacher" model to a smaller "student" model with minimal loss of accuracy (a loss sketch follows the See also section).
* Smaller models are cheaper to evaluate, so they can be deployed on less powerful hardware.

## See also

* [[BERT]]
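## Example

A minimal PyTorch sketch of the standard distillation objective (Hinton et al., 2015): the student is trained to match the teacher's temperature-softened output distribution via KL divergence, blended with ordinary cross-entropy on the hard labels. The `temperature` and `alpha` values here are illustrative hyperparameters, not settings from this wiki.

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels,
                      temperature=4.0, alpha=0.5):
    """Blend the soft-target (teacher) loss with hard-label cross-entropy."""
    # Soften both distributions with the temperature and match them via
    # KL divergence; the T^2 factor keeps gradient magnitudes comparable
    # across temperatures.
    soft_loss = F.kl_div(
        F.log_softmax(student_logits / temperature, dim=-1),
        F.softmax(teacher_logits / temperature, dim=-1),
        reduction="batchmean",
    ) * (temperature ** 2)
    # Standard cross-entropy against the ground-truth labels.
    hard_loss = F.cross_entropy(student_logits, labels)
    return alpha * soft_loss + (1.0 - alpha) * hard_loss
```

During training, the teacher runs in inference mode (`torch.no_grad()`) to produce `teacher_logits`, and only the student's parameters are updated with this loss.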