In my opinion, the worst long-term consequence is that even once newer CPUs fix these issues in hardware, we'll still pay a performance penalty, because code will be compiled to work on both old and new CPUs. It's the same situation as a new CPU whose fancy features go unused because the code was compiled to stay backwards compatible.
Intel's C compiler could generate code that detects CPU features at runtime years ago, and I think current GCC can do the same. Binaries just become a bit more bloated, since they have to store both versions of the compiled code.
No, it's done on a per-function basis. At program startup it runs the necessary checks (CPUID etc.) and sets up the function pointers accordingly (see the IFUNC mechanism in the linker/loader).
That's fine for code you already know could benefit from SIMD. But you can't tag every function of user code for safe/unsafe mode, and optimizations (inlining, unrolling, etc.) would only add to the mess. The generated code would be a Frankenstein.
Probably something like a table of function pointers for "hot" code that gets set up at program start. But compiler writers are far more clever at this sort of thing than I am, so I'm genuinely curious what solutions they came up with.