> Generative adversarial networks, or GANs, are a conceptual advance that allow reinforcement learning problems to be solved automatically. They mark a step toward the longstanding goal of artificial general intelligence while also harnessing the power of parallel processing so that a program can train itself by playing millions of games against itself. At a conceptual level, GANs link prediction with generative models.
What? Every sentence here is so wrong I have a hard time seeing what kind of misunderstanding would lead to this.
GANs are a conceptual advance in generative models (i.e. models that can generate more, similar data). Reinforcement learning is a separate field. Parallel processing is ubiquitous and has nothing to do with GANs or reinforcement learning (both are usually pretty parallelized). Self-play sounds like they wanted to talk about the AlphaGo/AlphaZero papers? And GANs are infamously not really predictive/discriminative. If anything, they thoroughly disconnected prediction from generative models.
> GANs are a conceptual advance in generative models (i.e. models that can generate more, similar data).
This is something I've long had confusion with, coming from a probabilistic perspective.
How does a GAN model the joint probability of the data? My understanding was that was what a generative model does. There doesn't seem to be a clear probabilistic interpretation of a GAN whatsoever.
Part of the cleverness of GANs was to have found a way to train a neural network that generates data without explicitly modeling the probability density.
In a stats textbook, when you know that your training data comes from a normal distribution, you can maximize the likelihood with respect to the parameters and then use the fitted distribution for sampling. That's basic theory.
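For that textbook case, a minimal numpy sketch (the Gaussian toy data here is my own illustration, not anything from the thread) might look like:

```python
import numpy as np

# Toy training data assumed to come from an (unknown) normal distribution.
rng = np.random.default_rng(0)
data = rng.normal(loc=2.0, scale=0.5, size=10_000)

# MLE for a Gaussian has a closed form: sample mean and (biased) sample std.
mu_hat = data.mean()
sigma_hat = data.std()          # ddof=0 by default, i.e. the MLE

# With the fitted density in hand, sampling new data is trivial.
new_samples = rng.normal(loc=mu_hat, scale=sigma_hat, size=5)
print(mu_hat, sigma_hat, new_samples)
```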
In practice, it was very hard to learn a good pdf for real experimental data, say when all you had was a training set of images. GANs provided a way to bypass this.
Of course, people could have said "hey, let's generate samples without maximizing a log-likelihood first", but they didn't know how to do it properly, i.e. how to train the network in any other way besides minimizing cross-entropy (which is equivalent to maximizing the log-likelihood).
Then GANs actually provided a new loss function that a generator could be trained with. Total paradigm shift!
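To make that "new loss function" concrete, here is a minimal PyTorch sketch of the adversarial objective; the tiny MLP shapes and hyperparameters are placeholders of my own, not something from the thread or the original paper:

```python
import torch
import torch.nn as nn

# Generator maps noise z to fake samples; discriminator scores real vs. fake.
latent_dim, data_dim = 8, 2
G = nn.Sequential(nn.Linear(latent_dim, 32), nn.ReLU(), nn.Linear(32, data_dim))
D = nn.Sequential(nn.Linear(data_dim, 32), nn.ReLU(), nn.Linear(32, 1))

opt_g = torch.optim.Adam(G.parameters(), lr=2e-4)
opt_d = torch.optim.Adam(D.parameters(), lr=2e-4)
bce = nn.BCEWithLogitsLoss()

def train_step(real_batch):
    n = real_batch.shape[0]
    # Discriminator step: push D(real) toward 1 and D(G(z)) toward 0.
    fake = G(torch.randn(n, latent_dim)).detach()   # no gradient into G here
    d_loss = bce(D(real_batch), torch.ones(n, 1)) + bce(D(fake), torch.zeros(n, 1))
    opt_d.zero_grad(); d_loss.backward(); opt_d.step()

    # Generator step: fool D, i.e. push D(G(z)) toward 1.
    g_loss = bce(D(G(torch.randn(n, latent_dim))), torch.ones(n, 1))
    opt_g.zero_grad(); g_loss.backward(); opt_g.step()
    return d_loss.item(), g_loss.item()

# One step on a toy "real" distribution (a shifted 2-D Gaussian):
print(train_step(torch.randn(64, data_dim) + 3.0))
```

The point being: nowhere do we evaluate or maximize a likelihood of the data; the generator is trained only through the discriminator's judgment.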
I'm on board with all of this; I think even before GANs it was becoming popular to optimize losses that weren't necessarily a log-likelihood.
But I'm confused by the usage of the phrase "generative model", which I took to always mean a probabilistic model of the joint distribution that can be sampled from. I get that GANs generate data samples, but it seems different.
This is the problem when people use technical terms loosely and interchangeably with their English definitions. Generative model classifiers are precisely as you describe: they model a joint distribution that one can sample from.
A GAN cannot even fit this definition because it is not a classifier. It is composed of a generator and a discriminator. The discriminator is a discriminative classifier. The generator is, well, a generator. It has nothing to do with generative model classifiers. Then you get some chain like: neural network generator > model that generates > generative model. This leads to confusion.
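To make that terminology distinction concrete: a generative classifier models the joint p(x, y) = p(y) p(x|y), which you can both sample from and invert with Bayes' rule. A toy numpy sketch (the class priors and Gaussians are made-up numbers of my own):

```python
import numpy as np

rng = np.random.default_rng(0)

# A generative *classifier* in the textbook sense: model the joint p(x, y)
# as p(y) * p(x | y), here with one 1-D Gaussian per class.
priors = {0: 0.3, 1: 0.7}                   # p(y)
params = {0: (-2.0, 1.0), 1: (1.5, 0.5)}    # (mean, std) of p(x | y)

def sample_joint(n):
    """Draw (x, y) pairs from the modeled joint distribution."""
    ys = rng.choice([0, 1], size=n, p=[priors[0], priors[1]])
    xs = np.array([rng.normal(*params[y]) for y in ys])
    return xs, ys

def classify(x):
    """Bayes rule: pick the class maximizing p(y) * p(x | y) (up to a constant)."""
    def score(y):
        mu, sd = params[y]
        return priors[y] * np.exp(-0.5 * ((x - mu) / sd) ** 2) / sd
    return max(priors, key=score)

xs, ys = sample_joint(5)
print(list(zip(xs.round(2), ys)), [classify(x) for x in xs])
```

A GAN's generator gives you the sampling part of this picture but not the explicit densities, and its discriminator is a purely discriminative classifier.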
> Now, our model also describes a distribution \hat{p}_\theta(x) (green) that is defined implicitly by taking points from a unit Gaussian distribution (red) and mapping them through a (deterministic) neural network — our generative model (yellow). Our network is a function with parameters \theta, and tweaking these parameters will tweak the generated distribution of images. Our goal then is to find parameters \theta that produce a distribution that closely matches the true data distribution (for example, by having a small KL divergence loss). Therefore, you can imagine the green distribution starting out random and then the training process iteratively changing the parameters \theta to stretch and squeeze it to better match the blue distribution.
This is precisely a generative model in the probabilistic sense. The section on VAEs spells this out even more explicitly:
> For example, Variational Autoencoders allow us to perform both learning and efficient Bayesian inference in sophisticated probabilistic graphical models with latent variables (e.g. see DRAW, or Attend Infer Repeat for hints of recent relatively complex models).
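(For reference, the explicitly probabilistic object a VAE maximizes is the evidence lower bound on the log-likelihood, written here from memory rather than from the linked post:

\log p_\theta(x) \ge \mathbb{E}_{q_\phi(z \mid x)}\left[\log p_\theta(x \mid z)\right] - \mathrm{KL}\left(q_\phi(z \mid x) \,\|\, p(z)\right)

with encoder q_\phi(z|x) and decoder p_\theta(x|z). A GAN never computes anything like this.)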
The issue with GANs is that, while they model the joint probability of the input space, they aren't (easily) inspectable, in the sense that you can't get any understanding of how inputs relate to outputs. This means they appear different from traditional generative models, where this is usually a goal.
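In code, the "implicitly defined" distribution from the quoted passage is nothing more than noise pushed through a deterministic network; you can sample from it but never evaluate its density at a point, which is exactly the inspectability problem. A minimal sketch (the layer sizes are arbitrary placeholders of my own):

```python
import torch
import torch.nn as nn

# Sample x ~ \hat{p}_\theta(x) by drawing z from a unit Gaussian and pushing
# it through a deterministic network G_theta. There is no way to ask the model
# for \hat{p}_\theta(x) at a given x; only sampling is available.
G_theta = nn.Sequential(nn.Linear(16, 64), nn.ReLU(), nn.Linear(64, 784))

z = torch.randn(128, 16)   # points from the unit Gaussian
x = G_theta(z)             # 128 samples from the implicit distribution
```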
They are reasonably competitive with GANs. I haven't kept up on the latest models on either side, but VAEs have historically tended to be a little blurrier than GANs.
I think VAEs haven't been the state of the art since around 2016-2017? They have been squeezed from both directions: autoregressive models on the compression side, GANs on the generation side.
They are still fairly competitive on both sides though.
Yeah, I guess I was thinking of VQVAE as a state-of-the-art example, but it was indeed 2017. Time flies! It's still pretty influential on newer systems though, e.g. OpenAI's DALL-E that made waves earlier this year has a VAE component (in addition to a Transformer component).
The generator implicitly models a joint probability of data by being a generative process that one can draw samples from. GAN training (at least under certain simplifying assumptions) minimizes the JS divergence between the generator distribution and the data distribution.
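For completeness, the standard argument from the original GAN paper: for a fixed generator, the optimal discriminator is

D^*(x) = \frac{p_{\text{data}}(x)}{p_{\text{data}}(x) + p_g(x)},

and plugging it back into the minimax value function gives

V(G, D^*) = 2\,\mathrm{JSD}(p_{\text{data}} \,\|\, p_g) - \log 4,

so minimizing over the generator, under the idealized assumptions of an optimal discriminator and unlimited model capacity, amounts to minimizing the Jensen-Shannon divergence to the data distribution.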