Convolutional Capsule Generative Adversarial Network

Presented by Marta

18th of July 2019

Image Synthesis with a Convolutional Capsule Generative Adversarial Network

Cher Bass et al. (2019)

Machine learning for biomedical imaging often suffers from a lack of labelled training data. One solution is to use generative models to synthesise more data. To this end, we introduce CapsPix2Pix, which combines convolutional capsules with the pix2pix framework, to synthesise images conditioned on class segmentation labels. We apply our approach to a new biomedical dataset of cortical axons imaged by two-photon microscopy, as a method of data augmentation for small datasets. We evaluate performance both qualitatively and quantitatively. Quantitative evaluation is performed by using image data generated by either CapsPix2Pix or pix2pix to train a U-net on a segmentation task, then testing on real microscopy data. Our method quantitatively performs as well as pix2pix, with an order of magnitude fewer parameters. Additionally, CapsPix2Pix is far more capable at synthesising images of different appearance, but the same underlying geometry. Finally, qualitative analysis of the features learned by CapsPix2Pix suggests that individual capsules capture diverse and often semantically meaningful groups of features, covering structures such as synapses, axons and noise.
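The quantitative evaluation described in the abstract (train a U-net on synthetic images, then test on real microscopy data) needs an overlap measure between predicted and ground-truth segmentations. The sketch below uses the Dice score as an example. Dice is an assumption here, since these notes do not state which metric the paper reports, and `dice_score`, `pred` and `real` are hypothetical names used only for illustration.

```python
import numpy as np


def dice_score(pred, target, eps=1e-8):
    """Dice overlap between two binary masks (1.0 means perfect overlap).

    Assumed metric for illustration; not confirmed as the paper's choice.
    """
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    intersection = np.logical_and(pred, target).sum()
    # eps keeps the ratio defined when both masks are empty.
    return (2.0 * intersection + eps) / (pred.sum() + target.sum() + eps)


# Toy masks: a U-net prediction vs. a real annotation (made-up values).
pred = np.array([[1, 1, 0], [0, 0, 0]])
real = np.array([[1, 0, 0], [0, 0, 0]])
score = dice_score(pred, real)
```

In the setting above, the same score would be computed for a U-net trained on CapsPix2Pix data and one trained on pix2pix data, both tested on the same real images.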

Paper here

Discussion Points

Would this be as effective on a multi-class problem? Could it work on more complex anatomy, e.g. for MRI or CT synthesis?
How much of the output variability comes from the capsules, and how much is actually due to the latent space (as opposed to the dropout used in Pix2Pix)? For what types of problem are capsules actually needed?
Are the convolutional capsules losing the benefits of the "global" relationships introduced with standard capsules?
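As background for the capsule questions above: a capsule outputs a vector rather than a scalar, and in the original capsule formulation (Sabour et al., 2017) a "squash" nonlinearity rescales that vector so its length lies between 0 and 1 while its direction is preserved. The NumPy sketch below illustrates that nonlinearity only. Whether CapsPix2Pix uses exactly this form is an assumption, and the example values are made up.

```python
import numpy as np


def squash(s, axis=-1, eps=1e-8):
    """Squash nonlinearity: v = (|s|^2 / (1 + |s|^2)) * s / |s|.

    Short vectors shrink towards zero length, long vectors approach
    length 1, and the direction of each vector is unchanged.
    """
    sq_norm = np.sum(s ** 2, axis=axis, keepdims=True)
    scale = sq_norm / (1.0 + sq_norm)
    # eps avoids division by zero for all-zero capsule vectors.
    return scale * s / np.sqrt(sq_norm + eps)


# Two hypothetical 2-D capsule vectors: one short, one long.
capsules = np.array([[0.1, 0.0], [3.0, 4.0]])
lengths = np.linalg.norm(squash(capsules), axis=-1)
```

The short vector ends up with a length close to 0 and the long one with a length close to 1, which is what lets the vector length be read as a presence probability in the standard capsule setup.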

Pros

In computer vision, capsules have been shown to learn successfully from only a fraction of the data
CapsPix2Pix has 7x fewer parameters than Pix2Pix
More variation in the generated data
Successful data augmentation for a segmentation task
Feature clustering and visualisation

Drawbacks

Computationally expensive (5x slower at training, 13x slower at testing)

Application to the medical imaging (MI) field

Image segmentation - part-to-whole

Presentation

Presentation here