Replying to myself with a question, as someone could have the answer: Would it be possible to create the splats without the training phase? If we have a fully modelled scene in Unreal Engine for example (like Matrix city), you shouldn't need to spend all the time training to recreate the data...
Of course! This has been done many times in the past, probably with better results than current deep-learning-based Gaussian splatting, which uses far too many splats to render a scene.
Basically, the problem with sparse pictures and point clouds in general is their lack of topology and their imprecise spatial positions. But when you already have the topology (e.g. with a mesh), you can extract (optimally) a set of points and compute each splat's radius and color so that there are no holes in the final image. That is usually done using the local curvature and the surface normal.
The 'optimally' part is difficult; an easier and faster approach is to do a greedy pass that selects good-enough splats.
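To make the greedy idea concrete, here is a rough sketch in plain Python, not anyone's actual pipeline. It samples candidate points on a triangle mesh (area-weighted) and keeps a candidate only if no already-kept splat lies within `radius` of it, so every candidate ends up covered by some splat. The function names, the dict fields, and the fixed `radius` are my own assumptions; a real version would adapt the radius to curvature, take colors from the material or texture, and skip degenerate triangles.

```python
import math
import random

def sample_triangle(a, b, c, rng):
    """Uniformly sample a point inside triangle abc."""
    u, v = rng.random(), rng.random()
    if u + v > 1.0:  # reflect back into the triangle
        u, v = 1.0 - u, 1.0 - v
    return tuple(a[i] + u * (b[i] - a[i]) + v * (c[i] - a[i]) for i in range(3))

def triangle_normal_and_area(a, b, c):
    """Unit normal and area from the edge cross product (assumes non-zero area)."""
    e1 = [b[i] - a[i] for i in range(3)]
    e2 = [c[i] - a[i] for i in range(3)]
    n = (e1[1] * e2[2] - e1[2] * e2[1],
         e1[2] * e2[0] - e1[0] * e2[2],
         e1[0] * e2[1] - e1[1] * e2[0])
    length = math.sqrt(sum(x * x for x in n))
    return tuple(x / length for x in n), 0.5 * length

def greedy_splats(vertices, triangles, radius, n_candidates=2000, seed=0):
    """Greedy selection: keep a candidate only if no kept splat covers it yet."""
    rng = random.Random(seed)
    info = [triangle_normal_and_area(*(vertices[i] for i in tri)) for tri in triangles]
    areas = [area for _, area in info]

    # Area-weighted candidate points, each tagged with its face normal.
    candidates = []
    for _ in range(n_candidates):
        t = rng.choices(range(len(triangles)), weights=areas)[0]
        a, b, c = (vertices[i] for i in triangles[t])
        candidates.append((sample_triangle(a, b, c, rng), info[t][0]))

    # Brute-force O(n * k) distance checks; fine for a sketch, slow for real scenes.
    splats = []
    for p, n in candidates:
        if all(math.dist(p, s["center"]) >= radius for s in splats):
            splats.append({"center": p, "normal": n, "radius": radius})
    return splats, candidates
```

Coverage here is only guaranteed for the sampled candidates, so the "no holes" property holds as well as the candidates approximate the surface. A denser candidate set or a slightly inflated radius would be the obvious knobs.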
I could be wrong, but being able to remove the step of estimating the camera position would save a large amount of time. You’re still going to need to train on the images to create the splats.
Yes, and then it gets interesting to think about procedurally generated splats, such as spawning a randomized distribution of grass splats on a field.
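As a toy illustration of the grass idea, something like the following could scatter blade-shaped splats over a rectangular field. Everything here is made up for the example: `scatter_grass`, the dict fields, and the size and color ranges do not correspond to any real splat file format or renderer.

```python
import math
import random

def scatter_grass(width, depth, density, seed=0):
    """Scatter thin, tall splats uniformly over a width x depth field.

    density is splats per unit area; all field names are hypothetical.
    """
    rng = random.Random(seed)
    count = int(width * depth * density)
    splats = []
    for _ in range(count):
        height = rng.uniform(0.05, 0.25)  # blade height, arbitrary units
        splats.append({
            # Center sits at half the blade height above the ground plane y = 0.
            "center": (rng.uniform(0.0, width), height / 2, rng.uniform(0.0, depth)),
            # Anisotropic scale: narrow in x and z, elongated in y.
            "scale": (0.01, height / 2, 0.01),
            # Random rotation about the vertical axis.
            "yaw": rng.uniform(0.0, 2.0 * math.pi),
            # Jittered green so the field does not look uniform.
            "color": (rng.uniform(0.1, 0.3), rng.uniform(0.5, 0.8), rng.uniform(0.1, 0.2)),
            "opacity": 0.9,
        })
    return splats
```

Calling `scatter_grass(10.0, 5.0, 4.0)` would give 200 splats. A fixed seed keeps the field reproducible, which matters if the splats are regenerated at load time instead of being stored.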
To me, the big issue is image quality versus generative efficiency. If splats make rendering complicated scenes efficient without requiring a lot of data/calculation "scaffolding", then you could do almost everything procedurally, maybe using AI models to fill in definitional detail.