xNeurals
Aug 30, 2021

Overview of Significant GAN Papers

As I’ve discussed in my previous posts, GANs have been widely researched since their introduction in 2014 by Goodfellow et al. [1]. A majority of this research has gone into studying the primary issues surrounding the use of GANs in practice and how to improve them. For those who are not familiar with GANs, they commonly suffer from the following problems:

(1) Unstable and slow training: in their original formulation, we need to maintain a careful balance between the time spent training the discriminator and the time spent training the generator.

(2) Mode collapse within the generator: the generator learns to cover only a few modes of the data distribution, and many causes for this have been proposed.

(3) A meaningless loss metric: the log-likelihood-style loss presented in Goodfellow et al.’s work does not correlate strongly with the quality of the samples the generator is producing. In most cases, we still resort to inspecting samples drawn from the generator to judge how training is progressing (the original objective is reproduced below for reference).
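For reference, this is the minimax objective from the original paper [1] that the problems above refer to, in standard notation with p_data the data distribution and p_z the noise prior:

```latex
\min_G \max_D \; V(D, G) =
  \mathbb{E}_{x \sim p_{\mathrm{data}}}\!\big[\log D(x)\big]
  + \mathbb{E}_{z \sim p_z}\!\big[\log\big(1 - D(G(z))\big)\big]
```

When the discriminator is optimal, minimizing this objective over G is equivalent (up to a constant) to minimizing the Jensen-Shannon divergence between the data distribution and the generator’s distribution, which is the divergence view discussed next.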

Wasserstein GANs [7], LSGANs [6], and EBGANs [9] all propose changing the loss function to alleviate exploding and vanishing gradients during training, thus addressing (1) and potentially (2). Recall that the original paper by Goodfellow et al. frames training the generator (against an optimal discriminator) as minimizing the Jensen-Shannon divergence between the data distribution and the generated distribution. Nowozin et al. extended this theory and showed that the adversarial approach in Goodfellow et al. is a special case of a more general framework: GANs can be successfully trained to minimize any f-divergence [3]. Mao et al. built on this result to introduce LSGANs (least squares GANs) [6], whose objective can be shown to minimize a particular f-divergence (a Pearson χ² divergence) for an appropriate choice of target labels. The least-squares loss pulls generated samples towards the decision boundary of the discriminator, so the generator cannot settle for samples that sit on the ‘correct’ side of the boundary yet far from the data manifold; such samples are still penalized and pulled towards the boundary, which also keeps gradients from vanishing for well-classified but poor samples.
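As a concrete example, with the common 0-1 coding of target labels (one choice; [6] also analyzes a more general a-b-c coding), the least-squares objectives look like:

```latex
% LSGAN discriminator and generator objectives, 0-1 coding of target labels
\min_D \; \tfrac{1}{2}\,\mathbb{E}_{x \sim p_{\mathrm{data}}}\!\big[(D(x) - 1)^2\big]
      + \tfrac{1}{2}\,\mathbb{E}_{z \sim p_z}\!\big[D(G(z))^2\big]
\qquad
\min_G \; \tfrac{1}{2}\,\mathbb{E}_{z \sim p_z}\!\big[(D(G(z)) - 1)^2\big]
```

Because the quadratic penalty grows with distance from the target value, generated samples far from the decision boundary still receive a substantial gradient even when they are already classified as ‘real’, which is the mechanism behind the pulling-towards-the-boundary behavior described above.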

On the other hand, Wasserstein GANs take a completely different approach to training GANs. In their prior work, ‘Towards Principled Methods for Training Generative Adversarial Networks’, Arjovsky and Bottou show that assuming the data distribution and the generated distribution share the same support (an assumption made, implicitly or explicitly, in many theoretical analyses of GANs) is generally far too strong [8]. Furthermore, once this assumption is dropped, Jensen-Shannon-divergence-based training becomes unstable, and the generator receives very little useful feedback from a discriminator that is allowed to train too well. This leads them to develop the Wasserstein GAN, based on the Wasserstein-1 metric (also called the Earth Mover’s Distance). They argue, with supporting theory, that Wasserstein GANs are far less prone to mode dropping and tend to train more stably. In practice, however, the Wasserstein-1 metric must be approximated (in [7], by clipping the weights of the critic), and the theoretical results do not necessarily carry over to a model trained on that approximation. Attempts have been made to remedy this, such as the gradient penalty explored in ‘Improved Training of Wasserstein GANs’ [10].
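To make the approximation concrete, here is a minimal PyTorch-style sketch (not the authors’ reference code) of the gradient penalty from [10]; it assumes the critic scores flat (batch, features) tensors, and the function and variable names are illustrative:

```python
import torch

def gradient_penalty(critic, real, fake):
    """WGAN-GP penalty [10]: push the critic's gradient norm towards 1
    on points interpolated between real and generated samples."""
    real, fake = real.detach(), fake.detach()
    eps = torch.rand(real.size(0), 1, device=real.device)      # per-sample mixing weight
    interp = (eps * real + (1 - eps) * fake).requires_grad_(True)
    scores = critic(interp)
    grads = torch.autograd.grad(outputs=scores.sum(), inputs=interp,
                                create_graph=True)[0]
    # Penalize deviation of the per-sample gradient norm from 1 (soft Lipschitz constraint)
    return ((grads.norm(2, dim=1) - 1) ** 2).mean()

# A typical critic loss for one training step ([10] uses a penalty weight of 10):
# loss_D = critic(fake).mean() - critic(real).mean() + 10.0 * gradient_penalty(critic, real, fake)
```

The penalty replaces the weight clipping used in [7] as a way of approximately enforcing the 1-Lipschitz constraint that the dual formulation of the Wasserstein-1 distance requires.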

GANs in the style of the original paper [1], the Wasserstein GAN [7], and LSGANs [6] frame GAN training as a distribution-matching problem. Zhao, Mathieu, and LeCun show that GANs can instead be formulated as an energy-minimization problem [9]. They propose a framework where the discriminator is a function that assigns low energies to points on the data manifold and high energies elsewhere. In this framework, the discriminator tries to learn the energy function, while the generator tries to map points onto the data manifold (so as to minimize the energy). In their experiments, using a regularized auto-encoder as the discriminator to learn the energy function proved very effective. In the adversarial training framework, the auto-encoder and the generative network are trained at the same time: the generator produces better and better contrastive samples while the regularized auto-encoder learns the energy function. They found that EBGANs compare favorably to traditional GANs.
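Concretely, with a regularized auto-encoder as the discriminator, the energy is the reconstruction error, and [9] trains with a margin loss roughly of the form below, where [.]^+ denotes max(0, .) and m is a positive margin:

```latex
% EBGAN: energy = auto-encoder reconstruction error
D(x) = \big\lVert \mathrm{Dec}(\mathrm{Enc}(x)) - x \big\rVert, \qquad
\mathcal{L}_D = D(x) + \big[\, m - D(G(z)) \,\big]^{+}, \qquad
\mathcal{L}_G = D\big(G(z)\big)
```

The discriminator thus learns to reconstruct real data well (low energy) while pushing generated samples up to an energy of at least m, and the generator tries to produce samples that the auto-encoder reconstructs well.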

Finally, several published GANs have had significant impact through architectural changes to the original GAN. Two of the most influential works in this area are Conditional GANs and DCGANs [2, 4]. While neither paper contains major theoretical results, the frameworks and architectural guidelines they proposed have seen widespread practical use. Conditional GANs develop an idea mentioned in Goodfellow et al.: they feed a conditioning input (such as a class label) into both the generator and the discriminator. This stabilizes and accelerates training while also reducing mode dropping. DCGANs brought convolutional neural networks into the adversarial training framework; they were among the first to reliably train GANs built from convolutional layers and laid out a set of architectural guidelines for doing so. These architectures are well suited to processing and generating image data.
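As an illustration of the conditioning idea (a minimal sketch, not the exact architecture from [2]; the layer sizes and names here are made up), a generator can simply concatenate a label embedding to its noise input, and a discriminator can do the same with its data input:

```python
import torch
import torch.nn as nn

class ConditionalGenerator(nn.Module):
    """Toy label-conditioned generator in the spirit of [2]; sizes are illustrative."""
    def __init__(self, z_dim: int = 100, n_classes: int = 10, out_dim: int = 784):
        super().__init__()
        self.label_emb = nn.Embedding(n_classes, n_classes)
        self.net = nn.Sequential(
            nn.Linear(z_dim + n_classes, 256),
            nn.ReLU(inplace=True),
            nn.Linear(256, out_dim),
            nn.Tanh(),
        )

    def forward(self, z: torch.Tensor, labels: torch.Tensor) -> torch.Tensor:
        # Condition the generator by concatenating the label embedding to the noise vector
        return self.net(torch.cat([z, self.label_emb(labels)], dim=1))

# Usage: sample 16 fake flattened 'images' for random class labels
G = ConditionalGenerator()
fake = G(torch.randn(16, 100), torch.randint(0, 10, (16,)))   # shape (16, 784)
```

The same trick applies to the discriminator, which receives the real or generated sample together with the label it is supposed to correspond to.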

While this is not an exhaustive list of every significant GAN developed in recent years, it covers some of the most influential GAN papers.

Resources:

[1] Ian J. Goodfellow, Jean Pouget-Abadie, Mehdi Mirza, Bing Xu, David Warde-Farley, Sherjil Ozair, Aaron Courville, Yoshua Bengio: “Generative Adversarial Networks”, 2014.

[2] Mehdi Mirza, Simon Osindero: “Conditional Generative Adversarial Nets”, 2014; [http://arxiv.org/abs/1411.1784 arXiv:1411.1784].

[3] Sebastian Nowozin, Botond Cseke, Ryota Tomioka: “f-GAN: Training Generative Neural Samplers using Variational Divergence Minimization”, 2016; [https://arxiv.org/abs/1606.00709 arXiv:1606.00709].

[4] Alec Radford, Luke Metz, Soumith Chintala: “Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks”, 2016; [https://arxiv.org/abs/1511.06434 arXiv:1511.06434].

[5] Xi Chen, Yan Duan, Rein Houthooft, John Schulman, Ilya Sutskever, Pieter Abbeel: “InfoGAN: Interpretable Representation Learning by Information Maximizing Generative Adversarial Nets”, 2016; [http://arxiv.org/abs/1606.03657 arXiv:1606.03657].

[6] Xudong Mao, Qing Li, Haoran Xie, Raymond Y. K. Lau, Zhen Wang, Stephen Paul Smolley: “Least Squares Generative Adversarial Networks”, 2016; [http://arxiv.org/abs/1611.04076 arXiv:1611.04076].

[7] Martin Arjovsky, Soumith Chintala, Léon Bottou: “Wasserstein GAN”, 2017; [http://arxiv.org/abs/1701.07875 arXiv:1701.07875].

[8] Martin Arjovsky, Léon Bottou: “Towards Principled Methods for Training Generative Adversarial Networks”, 2017; [http://arxiv.org/abs/1701.04862 arXiv:1701.04862].

[9] Junbo Zhao, Michael Mathieu, Yann LeCun: “Energy-based Generative Adversarial Network”, 2016; [https://arxiv.org/abs/1609.03126 arXiv:1609.03126].

[10] Ishaan Gulrajani, Faruk Ahmed, Martin Arjovsky, Vincent Dumoulin, Aaron Courville: “Improved Training of Wasserstein GANs”, 2017; [https://arxiv.org/abs/1704.00028 arXiv:1704.00028].

- Brian Loos
