#APaperADay 2. The GAN Landscape: Losses, Architectures, Regularization, and Normalization
GAN stands for Generative Adversarial Network: two neural networks engaged in a game, each trying to outwit the other. The networks function as adversaries. The one that tries to deceive the other is called the generator; the other is called the discriminator.
The generator is fed a set of numbers, usually randomly generated, tries to convert them into an instance of an object, for example an image, and passes the result on to the discriminator. The discriminator then has to say whether what it received was a real object or something the generator produced. During training, the discriminator provides this feedback to the generator, which uses it to improve its generation capabilities. This back-and-forth between the two networks is what makes the training adversarial.
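To make the loop concrete, here is a minimal sketch of one adversarial training step in TensorFlow. The toy fully connected networks, image size, and hyper-parameters are illustrative placeholders, not the models studied in the paper.

```python
import tensorflow as tf
from tensorflow.keras import layers

latent_dim = 64  # size of the random input vector fed to the generator

# Toy fully connected networks standing in for real GAN architectures.
generator = tf.keras.Sequential([
    tf.keras.Input(shape=(latent_dim,)),
    layers.Dense(128, activation="relu"),
    layers.Dense(28 * 28, activation="tanh"),  # e.g. a flattened 28x28 image
])
discriminator = tf.keras.Sequential([
    tf.keras.Input(shape=(28 * 28,)),
    layers.Dense(128, activation="relu"),
    layers.Dense(1),  # a single logit: real vs. generated
])

bce = tf.keras.losses.BinaryCrossentropy(from_logits=True)
g_opt = tf.keras.optimizers.Adam(1e-4)
d_opt = tf.keras.optimizers.Adam(1e-4)

def train_step(real_images):
    noise = tf.random.normal([tf.shape(real_images)[0], latent_dim])
    with tf.GradientTape() as g_tape, tf.GradientTape() as d_tape:
        fakes = generator(noise, training=True)
        real_logits = discriminator(real_images, training=True)
        fake_logits = discriminator(fakes, training=True)
        # Discriminator: label real samples 1 and generated samples 0.
        d_loss = bce(tf.ones_like(real_logits), real_logits) + \
                 bce(tf.zeros_like(fake_logits), fake_logits)
        # Generator: fool the discriminator into labeling fakes as real.
        g_loss = bce(tf.ones_like(fake_logits), fake_logits)
    d_opt.apply_gradients(zip(d_tape.gradient(d_loss, discriminator.trainable_variables),
                              discriminator.trainable_variables))
    g_opt.apply_gradients(zip(g_tape.gradient(g_loss, generator.trainable_variables),
                              generator.trainable_variables))
    return d_loss, g_loss
```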
GANs belong to the domain of game theory: training is essentially a two-player zero-sum game, which means it involves solving a minimax problem.
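Concretely, the minimax problem is the original GAN objective of Goodfellow et al. (2014), in which the discriminator $D$ and the generator $G$ play against each other:

$$\min_G \max_D \; \mathbb{E}_{x \sim p_{\text{data}}}\big[\log D(x)\big] + \mathbb{E}_{z \sim p(z)}\big[\log\big(1 - D(G(z))\big)\big]$$

The discriminator pushes the value up by classifying correctly; the generator pushes it down by producing samples the discriminator mistakes for real data.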
Training a GAN is not easy.
Because a GAN consists of two deep neural networks, typically convolutional ones, the optimization machinery of those networks comes into play when solving the minimax problem. Anyone implementing a GAN has to choose an architecture, a loss function, a normalization scheme, and a regularization scheme. The authors of this paper survey these options, discuss common problems that arise, and release code along with pre-trained models on TensorFlow Hub.
After sweeping over a number of datasets, architectures, loss functions, and hyper-parameters, the authors conclude that the non-saturating loss is sufficiently stable, and that both gradient penalty and spectral normalization are useful in the context of high-capacity architectures.
In general, the discriminator in a GAN solves a classification problem: real versus generated. The two common loss functions are the minimax GAN loss and the non-saturating GAN loss. Under the minimax loss, the generator minimizes the probability of its samples being identified as generated, i.e. it minimizes log(1 - D(G(z))). Under the non-saturating loss, the generator instead maximizes the probability of its samples being classified as real, i.e. it maximizes log D(G(z)).
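A minimal NumPy sketch of the two generator losses, given the discriminator's logits on generated samples (the function names are mine, not the paper's):

```python
import numpy as np

def mm_generator_loss(fake_logits):
    """Minimax generator loss: minimize log(1 - D(G(z)))."""
    p_fake = 1.0 / (1.0 + np.exp(-fake_logits))  # sigmoid -> D(G(z))
    return np.mean(np.log(1.0 - p_fake + 1e-8))

def ns_generator_loss(fake_logits):
    """Non-saturating generator loss: maximize log D(G(z)),
    i.e. minimize -log D(G(z))."""
    p_fake = 1.0 / (1.0 + np.exp(-fake_logits))
    return -np.mean(np.log(p_fake + 1e-8))

logits = np.array([-3.0, -1.0, 0.5])  # discriminator logits on generated samples
print(mm_generator_loss(logits), ns_generator_loss(logits))
```

The difference matters early in training: when the discriminator confidently rejects fakes (large negative logits), log(1 - D(G(z))) is nearly flat and gives the generator almost no gradient, while -log D(G(z)) still does.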
The authors explored three normalization techniques, namely batch, layer, and spectral normalization, and concluded that spectral normalization worked best.
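Spectral normalization divides a weight matrix by an estimate of its largest singular value, usually obtained with a few steps of power iteration. A small NumPy sketch of the idea (not the paper's implementation):

```python
import numpy as np

def spectral_normalize(W, n_iters=20):
    """Estimate the largest singular value of W via power iteration
    and divide W by it, so the result has spectral norm ~1."""
    rng = np.random.default_rng(0)
    u = rng.normal(size=W.shape[0])
    for _ in range(n_iters):
        v = W.T @ u
        v /= np.linalg.norm(v) + 1e-12
        u = W @ v
        u /= np.linalg.norm(u) + 1e-12
    sigma = u @ W @ v  # estimated spectral norm
    return W / sigma

W = np.random.default_rng(1).normal(size=(4, 3))
W_sn = spectral_normalize(W)
print(np.linalg.svd(W_sn, compute_uv=False)[0])  # ~1.0
```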
In the same manner, they also explored two GAN architectures: the deep convolutional generative adversarial network (DCGAN) and the residual network (ResNet).
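For illustration, here is a minimal ResNet-style block in Keras; the paper's actual blocks (with up- or down-sampling and normalization) differ in detail:

```python
import tensorflow as tf
from tensorflow.keras import layers

def resnet_block(x, filters):
    """A minimal residual block: two convolutions plus a skip connection."""
    shortcut = x
    h = layers.Conv2D(filters, 3, padding="same", activation="relu")(x)
    h = layers.Conv2D(filters, 3, padding="same")(h)
    if shortcut.shape[-1] != filters:  # match channel counts for the skip
        shortcut = layers.Conv2D(filters, 1, padding="same")(shortcut)
    return layers.ReLU()(layers.Add()([h, shortcut]))

inputs = tf.keras.Input(shape=(32, 32, 3))
outputs = resnet_block(inputs, 64)
model = tf.keras.Model(inputs, outputs)
model.summary()
```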
As is the norm with models in this domain, the solution has to be evaluated, that is, the quality of the model has to be measured. The authors explored several proposed metrics considered suitable for this type of problem, most prominently the Fréchet Inception Distance (FID).
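FID fits a Gaussian to Inception-network features of real and of generated images and measures the Fréchet distance between the two. A sketch, using random vectors in place of real Inception activations:

```python
import numpy as np
from scipy.linalg import sqrtm

def fid(mu1, cov1, mu2, cov2):
    """Frechet distance between two Gaussians (mu1, cov1) and (mu2, cov2)."""
    covmean = sqrtm(cov1 @ cov2)
    if np.iscomplexobj(covmean):  # numerical noise can add tiny imaginary parts
        covmean = covmean.real
    return float(np.sum((mu1 - mu2) ** 2) + np.trace(cov1 + cov2 - 2.0 * covmean))

rng = np.random.default_rng(0)
real = rng.normal(size=(1000, 16))            # stand-ins for Inception features
fake = rng.normal(loc=0.5, size=(1000, 16))
print(fid(real.mean(axis=0), np.cov(real, rowvar=False),
          fake.mean(axis=0), np.cov(fake, rowvar=False)))
```

Lower is better: identical feature distributions give an FID near zero.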
The major problem with optimizing GANs is the extremely large number of possible hyper-parameter combinations; trying them all is currently impractical.
What the authors did instead was hold certain hyper-parameters constant and vary others systematically. They did this for the loss, the normalization, the regularization, and the network architecture.
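In code, such a study amounts to sweeping over a cross-product of settings. A hypothetical sketch, where the option names and the train_and_evaluate routine are illustrative, not the paper's actual configuration:

```python
import itertools

# Hypothetical design space; the paper's exact grid differs.
losses = ["non_saturating", "minimax"]
normalizations = ["none", "batch", "layer", "spectral"]
regularizers = ["none", "gradient_penalty"]
architectures = ["dcgan", "resnet"]

for loss, norm, reg, arch in itertools.product(
        losses, normalizations, regularizers, architectures):
    config = {"loss": loss, "normalization": norm,
              "regularization": reg, "architecture": arch}
    # train_and_evaluate(config)  # hypothetical training routine
    print(config)
```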
In summary, the authors recommend the non-saturating GAN loss and spectral normalization as default choices. If computing resources permit, they propose adding the gradient penalty and training the model until convergence.
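The gradient penalty referenced here is the WGAN-GP style regularizer: it penalizes the discriminator when the norm of its gradient, evaluated on random interpolations between real and generated samples, drifts away from 1. A TensorFlow sketch, assuming a discriminator that takes image-shaped (NHWC) batches:

```python
import tensorflow as tf

def gradient_penalty(discriminator, real, fake):
    """Mean squared deviation of the discriminator's gradient norm from 1,
    measured on random interpolations between real and fake samples."""
    batch = tf.shape(real)[0]
    eps = tf.random.uniform([batch, 1, 1, 1], 0.0, 1.0)
    interp = eps * real + (1.0 - eps) * fake
    with tf.GradientTape() as tape:
        tape.watch(interp)
        logits = discriminator(interp, training=True)
    grads = tape.gradient(logits, interp)
    norms = tf.sqrt(tf.reduce_sum(tf.square(grads), axis=[1, 2, 3]) + 1e-12)
    return tf.reduce_mean((norms - 1.0) ** 2)
```

The penalty is typically scaled by a coefficient and added to the discriminator loss.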
The paper is available on arXiv: https://arxiv.org/abs/1807.04720