I mean, it is a bit hard to answer your other questions because, like I was pointing out, "embeddings" and "latent spaces" are pretty vague terms. For the mathy side, normalizing flows are a great choice since you can parameterize whatever data you have into whatever distribution you want. You then work in that distribution you created, which the flow maps to and from the data exactly (flows are invertible by construction; the approximation is only in how well the transformed distribution matches your target). Other models do similar things, just lossier, but they're better at things like sample generation. That's the tradeoff: interpretability/density vs expressivity/sample quality. Diffusion and NODEs/score models are closing that gap, though.

But you're going to need to look at applied papers to see more people using them as operational vector spaces. For example, VITS (a TTS model) uses a normalizing flow to parameterize parts of the model, and controller networks tend to use similar things. It's more about thinking through how your network works and communicates. I think a lot of people just aren't thinking to hack away and operate on networks as if they're mathematical models instead of locked boxes.
https://openai.com/research/glow
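If you want to see the mechanics, here's a minimal toy sketch of a RealNVP-style affine coupling flow in PyTorch (my own illustration, not the Glow code): it learns an exact, invertible map between data and a Gaussian latent space, so you can encode, edit in the latent space, and decode back losslessly.

```python
import torch
import torch.nn as nn

class AffineCoupling(nn.Module):
    """Affine coupling layer: transforms half the dims conditioned on the other half."""
    def __init__(self, dim, hidden=64, flip=False):
        super().__init__()
        self.flip = flip
        half = dim // 2
        self.net = nn.Sequential(
            nn.Linear(half, hidden), nn.ReLU(),
            nn.Linear(hidden, 2 * half),  # outputs log-scale and shift
        )

    def forward(self, x):
        x1, x2 = x.chunk(2, dim=-1)
        if self.flip:
            x1, x2 = x2, x1
        log_s, t = self.net(x1).chunk(2, dim=-1)
        log_s = torch.tanh(log_s)           # keep scales well-behaved
        y2 = x2 * torch.exp(log_s) + t      # invertible affine transform
        y1 = x1
        if self.flip:
            y1, y2 = y2, y1
        return torch.cat([y1, y2], dim=-1), log_s.sum(-1)

    def inverse(self, y):
        y1, y2 = y.chunk(2, dim=-1)
        if self.flip:
            y1, y2 = y2, y1
        log_s, t = self.net(y1).chunk(2, dim=-1)
        log_s = torch.tanh(log_s)
        x2 = (y2 - t) * torch.exp(-log_s)   # exact inverse of the affine transform
        x1 = y1
        if self.flip:
            x1, x2 = x2, x1
        return torch.cat([x1, x2], dim=-1)

class Flow(nn.Module):
    def __init__(self, dim=2, n_layers=6):
        super().__init__()
        self.layers = nn.ModuleList(
            [AffineCoupling(dim, flip=(i % 2 == 1)) for i in range(n_layers)]
        )
        self.base = torch.distributions.Normal(0.0, 1.0)

    def forward(self, x):                   # data -> latent
        log_det = 0.0
        for layer in self.layers:
            x, ld = layer(x)
            log_det = log_det + ld
        return x, log_det

    def inverse(self, z):                   # latent -> data
        for layer in reversed(self.layers):
            z = layer.inverse(z)
        return z

    def log_prob(self, x):                  # exact density via change of variables
        z, log_det = self.forward(x)
        return self.base.log_prob(z).sum(-1) + log_det

# Toy training: fit the flow to a 2D Gaussian mixture by maximizing exact log-likelihood.
flow = Flow()
opt = torch.optim.Adam(flow.parameters(), lr=1e-3)
centers = torch.tensor([[2.0, 0.0], [-2.0, 0.0]])
for step in range(2000):
    x = centers[torch.randint(0, 2, (256,))] + 0.3 * torch.randn(256, 2)
    loss = -flow.log_prob(x).mean()
    opt.zero_grad(); loss.backward(); opt.step()

# Operate in the latent space: encode two points, average them, decode exactly.
with torch.no_grad():
    z, _ = flow.forward(x[:2])
    midpoint = flow.inverse(0.5 * (z[0] + z[1]).unsqueeze(0))
```

The point of the exercise: because every layer is invertible, `log_prob` is the exact density, and anything you do in z-space round-trips back to data space with no reconstruction error. That's the "operate on the network as a mathematical model" part.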
Here's a video of moving around in the latent space of a diffusion model
https://www.youtube.com/watch?v=vEnetcj_728
Here's a StyleGAN one
https://www.youtube.com/watch?v=bRrS74RXsSM
Or a VAE on MNIST
https://www.youtube.com/shorts/pgmnCU_DxzM
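If you want to reproduce that kind of latent walk yourself, here's a hedged sketch: spherical interpolation (slerp) between two latent codes, decoded frame by frame. The `decoder` below is an untrained stand-in I made up for illustration; swap in whatever trained generator you're poking at (VAE decoder, GAN generator, etc.).

```python
import torch

def slerp(z0, z1, t):
    """Spherical interpolation; usually walks Gaussian latents better than lerp."""
    z0n, z1n = z0 / z0.norm(), z1 / z1.norm()
    omega = torch.acos((z0n * z1n).sum().clamp(-1.0, 1.0))
    return (torch.sin((1 - t) * omega) * z0 + torch.sin(t * omega) * z1) / torch.sin(omega)

# Untrained stand-in decoder mapping a 16-d latent to a 28x28 image (MNIST-sized).
decoder = torch.nn.Sequential(
    torch.nn.Linear(16, 64), torch.nn.ReLU(), torch.nn.Linear(64, 28 * 28)
)

# Walk a straight arc between two random latent codes and decode each frame.
z0, z1 = torch.randn(16), torch.randn(16)
with torch.no_grad():
    frames = [decoder(slerp(z0, z1, t)).view(28, 28) for t in torch.linspace(0, 1, 30)]
```

Stack the frames into a video and you get exactly the kind of morph those links show.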