OP here. Thanks for the feedback. I agree that frame-to-frame consistency is quite bad currently. I did address that in the post, hinting at some of the techniques others have mentioned here, like in/out-painting and masking previous frames.
For me, the exciting parts of this experiment were finding the opportunities and limits of realtime generation, and exploring ways of grounding generated content in a solid yet player-controlled world layer.