These programs do thousands of iterations. How does one prove that the images show the Mandelbrot set and not some artifact of issues with numerical stability?
All the visualizations of the Mandelbrot set I've seen use a threshold of iterations to define the edge. I think that threshold matters more than any instability due to floating-point numbers. A more formal analysis of numerical stability comes down to whether the iterated function is well-conditioned. I don't have the math to state it formally, but my intuition is that the Mandelbrot iteration is well-conditioned, in that a small change to the input has only a small effect on the number of iterations, except at the very edge, where the threshold defines the boundary anyway.
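A quick empirical check of that conditioning intuition (the function name and sample points below are my own illustrative choices, not anything from a real renderer): perturb the parameter c by a tiny amount and compare escape-iteration counts. Points well inside or well outside the set give identical counts; only points close to the boundary can be sensitive.

```python
def escape_count(c, max_iter=100, radius=2.0):
    """Number of iterations of z -> z^2 + c before |z| exceeds radius."""
    z = 0j
    for n in range(max_iter):
        if abs(z) > radius:
            return n
        z = z * z + c
    return max_iter  # never escaped: treated as inside the set

eps = 1e-9
for c in (0.1 + 0.1j,     # well inside the set
          1.0 + 1.0j,     # well outside: escapes almost immediately
          -0.75 + 0.1j):  # close to the boundary near the "neck"
    n0 = escape_count(c)
    n1 = escape_count(c + eps)
    print(c, n0, n1, "stable" if n0 == n1 else "sensitive")
```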
The iterative step is z -> z^2 + c, or, writing z = a + bi and c = u + vi in components: a + bi -> (a^2 - b^2 + u) + (2ab + v)i
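That component form, written out directly (names a, b, u, v follow the decomposition above; a sketch, not code from any particular renderer):

```python
def step(a, b, u, v):
    """One iteration of z -> z^2 + c, with z = a + bi and c = u + vi."""
    return a * a - b * b + u, 2 * a * b + v

# Cross-check against Python's complex arithmetic. The values here are
# exactly representable in binary, so the comparison is exact:
a, b = step(0.5, 0.5, 0.25, 0.25)
print(complex(a, b) == complex(0.5, 0.5) ** 2 + complex(0.25, 0.25))  # True
```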
Multiplying and adding real numbers of bounded size (here between -2 and 2, inside the escape radius) are fundamentally stable operations when done with floats. The only possible source of instability is when a and b are very close, so that a^2 - b^2 suffers cancellation and magnifies errors from the limits of numerical accuracy. However, this is mitigated by two things: the region of such numbers is very small, and if z is in that region, z^2 + c will tend not to be. (The exceptional region, where both the current iterate and the next one land badly, is then a small dot rather than a thin line.) So over thousands of iterations, the possibility that some starting values repeatedly hit error-amplifying steps is either excluded or just very, very unlikely (we don't need to rigorously determine which).
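To see the cancellation concretely (the specific numbers are arbitrary illustrations), compare a*a - b*b in floats against an exact rational reference when a and b are nearly equal. The relative error of the difference is much larger than one ulp, but the absolute error stays around 1e-16, which is what matters for the escape test:

```python
from fractions import Fraction  # exact rational arithmetic for reference

a = 0.6666667
b = 0.6666666

float_result = a * a - b * b
# Fraction(x) converts a float to its exact binary value, so this is the
# exact result of a^2 - b^2 for the same float inputs:
exact = Fraction(a) ** 2 - Fraction(b) ** 2

rel_err = abs((Fraction(float_result) - exact) / exact)
abs_err = abs(Fraction(float_result) - exact)
print(float_result, float(rel_err), float(abs_err))
```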
Now remember what we are doing: we're just checking whether the iterated value runs off to infinity or stays in the region where it remains bounded. We don't need very precise accuracy except when we are on the boundary, as another commenter pointed out. Any region of badly behaved numbers will only tend to break the algorithm if it overlaps the boundary region, so that's one more condition we've placed on our very small region of possibly bad starting values. Furthermore, even if there were lots of bad values on or very near the boundary, they would only tend to move the plotted boundary by a very small amount.

It's easy to check that most values near the boundary quickly either run away to infinity or converge towards the central region (technically towards orbits in the central region rather than limit points). We can check the rough shape of the boundary using a very small number of iterations, fewer than could plausibly go wrong using 64-bit floats. If you use, say, 100 iterations but keep track of potential large errors, you will plot something indistinguishable from the actual Mandelbrot set, with only a very, very small set of points where you are 'not sure' of the actual behavior. It's also quite easy to see that points on either side of the boundary quickly iterate away from it - that is basically the definition of the set.
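A minimal escape-time render along these lines - a coarse ASCII plot with a modest iteration cap. The grid bounds, width, and the cap of 100 iterations are illustrative choices, not prescribed by anything above:

```python
def mandelbrot_row(im, re_min=-2.0, re_max=0.6, width=64, max_iter=100):
    """One row of a coarse escape-time plot at imaginary part `im`."""
    row = []
    for i in range(width):
        c = complex(re_min + (re_max - re_min) * i / (width - 1), im)
        z = 0j
        for n in range(max_iter):
            if abs(z) > 2.0:      # escaped the radius-2 disk
                row.append(" ")
                break
            z = z * z + c
        else:
            row.append("#")       # never escaped: plotted as inside
    return "".join(row)

for j in range(21):
    print(mandelbrot_row(1.2 - 2.4 * j / 20))
```

Even at this resolution and iteration count, the familiar cardioid-and-bulb shape appears; raising max_iter sharpens only a thin band around the boundary.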
So although it might be hard either to be extremely sure of the exact boundary details or to plot highly zoomed-in details with very good accuracy, it's quite easy to draw the set to a reasonable standard.
A lot of this math was developed by experts in computational and numerical mathematics who looked very hard at questions of stability. In fact, a lot of fractal and chaotic behavior was dismissed as an artifact of numerical instability before it could be rigorously demonstrated to be real. (Not the Mandelbrot set itself, to my understanding.)