These models can generate new text. If their abilities generalise at all (and there is plenty of evidence that they DO generalise on some level), then there is no obstacle in principle to generating new proofs, although the further a proof is from the training data, the less likely success becomes.
For an obvious example of generalisation: the models can write code that is not present in their training data. If you ask one to write some specific, though easy, function, it is very unlikely that exact function appears verbatim in the dataset, and yet the model can adapt.
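To make that concrete, here is a hypothetical example (my own, not from any dataset) of the kind of "specific but easy" request meant here: trivial to write, yet the exact combination of constraints is unlikely to occur verbatim in a training corpus, so producing it requires at least some generalisation.

    # Hypothetical "specific but easy" request: return the third-largest
    # distinct even number in a list, or None if there are fewer than three.
    def third_largest_even(numbers):
        evens = sorted({n for n in numbers if n % 2 == 0}, reverse=True)
        return evens[2] if len(evens) >= 3 else None

    print(third_largest_even([7, 4, 10, 2, 8, 8, 3]))  # -> 4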