Elastic Weight Consolidation
Presented by Irme, 22nd of August 2019
Overcoming catastrophic forgetting in neural networks
James Kirkpatrick et al. (2017)
The ability to learn tasks in a sequential fashion is crucial to the development of artificial intelligence. Neural networks are not, in general, capable of this and it has been widely thought that catastrophic forgetting is an inevitable feature of connectionist models. We show that it is possible to overcome this limitation and train networks that can maintain expertise on tasks which they have not experienced for a long time. Our approach remembers old tasks by selectively slowing down learning on the weights important for those tasks. We demonstrate our approach is scalable and effective by solving a set of classification tasks based on the MNIST handwritten digit dataset and by learning several Atari 2600 games sequentially.
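The "selectively slowing down learning on the weights important for those tasks" in the abstract is realised in the paper as a quadratic penalty on the distance from the task-A solution, weighted by the diagonal Fisher information. Below is a minimal PyTorch sketch of that penalty; function and variable names (and the lambda default) are illustrative, not taken from the authors' code.

```python
import torch

def ewc_penalty(model, fisher, star_params, lam=1000.0):
    """EWC penalty: (lambda / 2) * sum_i F_i * (theta_i - theta*_{A,i})^2.

    `fisher` and `star_params` are dicts keyed by parameter name, holding the
    diagonal Fisher information and the parameter values after training task A.
    `lam` is the importance hyperparameter lambda (value here is arbitrary).
    """
    penalty = 0.0
    for name, param in model.named_parameters():
        penalty = penalty + (fisher[name] * (param - star_params[name]) ** 2).sum()
    return (lam / 2.0) * penalty

# When training on task B, the total loss becomes:
#   loss = task_b_loss + ewc_penalty(model, fisher_a, params_a)
```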
Discussion Points
Comparison with the distillation approach proposed by Hinton et al. (2015)
- Distillation loss (Hinton et al., 2015) can also be used to address catastrophic forgetting;
- The activations of the last layer are compared between two networks (or the same network at different training iterations), with cross-entropy on the softened outputs as the loss function;
- Hinton et al. argue that by tuning the softmax temperature, the soft targets convey what makes a datapoint belong to one class rather than another (see the sketch below).
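A minimal PyTorch sketch of the distillation loss discussed above, assuming the standard softened-softmax formulation of Hinton et al. (2015); names and the default temperature are illustrative.

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, T=2.0):
    """Cross-entropy between the softened teacher and student distributions.

    Raising the temperature T spreads probability mass over non-target
    classes, exposing which classes the teacher considers similar.
    """
    soft_teacher = F.softmax(teacher_logits / T, dim=1)
    log_soft_student = F.log_softmax(student_logits / T, dim=1)
    # Cross-entropy H(p_teacher, p_student); the T^2 scaling follows
    # Hinton et al. so gradient magnitudes stay comparable across T.
    return -(soft_teacher * log_soft_student).sum(dim=1).mean() * (T ** 2)
```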
Pros and Drawbacks
- Will EWC also preserve this kind of inter-class knowledge?
- Which approach would be preferable for preventing or lessening catastrophic forgetting?
Applications to the MI field
- Micro-bleeds segmentation;
- Unbalanced datasets.