The power of the optimizations available to C++ is what makes it so fast (see how slow debug mode is vs. -O2 and friends), and what lets C++ stay fast in the face of common, easy-to-understand but technically perf-hostile patterns: bit-counting loops turned into popcnt, auto-vectorization, DCE, RCE, CSE, CFG simplification, LTCG/LTO, and so on. These things let you write "high level" code and algorithms (to a point; some "high level" paradigms absolutely eviscerate the compiler's ability to optimize) and still get great hardware-level performance. That matters far more overall than the time it takes to compile your program, and even more so once you consider that such programs are often shipped once and then enter maintenance mode.
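As a concrete example of that idiom recognition (my sketch, and the exact flags are an assumption): recent GCC and Clang both recognize the classic Kernighan bit-counting loop and, at -O2 on a popcnt-capable target (e.g. -march=x86-64-v2), collapse it into a single popcnt instruction:

    #include <cstdint>

    // Classic Kernighan bit-count loop. At -O2 with a popcnt-capable
    // target, recent GCC and Clang recognize this idiom and emit a
    // single popcnt instruction instead of the loop.
    int count_bits(std::uint64_t x) {
        int n = 0;
        while (x) {
            x &= x - 1;  // clear the lowest set bit
            ++n;
        }
        return n;
    }

(In C++20 you'd just call std::popcount from <bit>, but the point is that the compiler rescues even the naive version.)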
It doesn't really have much to do with compatibility (not entirely, anyway: the biggest fixable obstacles to good optimization quality are things that would need a system-level rethink of how hardware exceptions happen). It just isn't reasonable to expect developers to know how to optimize by hand, and it doesn't scale.
In many contexts there's rarely a reason to pass -O2/-O3. A project that is built thousands of times during development may, by comparison, only be run on intensive workloads (where -O2 performance is actually a necessity) a handful of times. A dev build can usually be -O0, which can dramatically improve compile times.
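And if a few hot functions still need speed inside an otherwise -O0 dev build, GCC can raise the level per function (a GCC-specific sketch; hot_inner_loop is a made-up name, and Clang ignores this attribute):

    #include <cstddef>

    // GCC-specific: optimize just this function at -O2 while the rest
    // of the translation unit (and the build) stays at -O0.
    __attribute__((optimize("O2")))
    void hot_inner_loop(float* data, std::size_t n) {
        for (std::size_t i = 0; i < n; ++i)
            data[i] *= 2.0f;
    }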
It depends. -O0 turns off a few trimming optimizations and can cause more information (code or DWARF) to end up in the object files, which may eventually slow the build down. In our large code base we found that -O1 works best in terms of compilation speed.
In https://ossia.io, with PCH, clang, ninja, mold, and an artificial split into shared libraries for development builds, I generally get a compile-edit-run cycle of a couple of seconds... I wouldn't say it's too much of a problem if you use the tools already available.
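For anyone who hasn't used a PCH: it's just an ordinary header that front-loads heavy, rarely-changing includes so each translation unit doesn't reparse them. A generic sketch (not ossia's actual PCH; with CMake you'd wire it up via target_precompile_headers):

    // pch.h -- heavy, stable headers that nearly every TU includes anyway.
    // Precompiling these once avoids reparsing them per translation unit.
    #pragma once
    #include <algorithm>
    #include <map>
    #include <memory>
    #include <string>
    #include <vector>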
I can't really understand this take. Compilation times are tiny in most projects and manageable in the large ones. Compilation is perfectly parallel, and modules and other improvements reduce it by another order of magnitude.
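For reference, the C++20 modules shape looks roughly like this (a minimal sketch; the file extension and build wiring vary by compiler):

    // math.cppm -- module interface unit. The compiler builds this once
    // into a binary module interface instead of reparsing a header per TU.
    export module math;

    export int add(int a, int b) {
        return a + b;
    }

    // main.cpp -- consumers import the prebuilt interface; no textual
    // inclusion, so the per-TU header-reparsing cost disappears.
    import math;

    int main() {
        return add(2, 3) == 5 ? 0 : 1;
    }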
Ninja, Icecream, ccache (I personally don't use that one), LLD or mold, breaking up the largest compilation units, avoiding internal static libraries at least for debug builds, not choosing the maximum amount of debug info... can result in edit-compile-run cycles under five seconds. Time for clean builds strongly depends on project size and template usage, obviously.
Try avoiding the standard library. Some headers, like <type_traits>, are so large and complex that they can add a few hundred milliseconds to each CU that ends up including them.
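To illustrate the kind of thing that helps (my sketch, not the parent's code): if a widely-included header only needs a trait or two, it can define minimal stand-ins, or lean on compiler builtins, instead of pulling in all of <type_traits>:

    // mini_traits.h -- tiny stand-ins for the two <type_traits>
    // facilities this hypothetical header actually needs.
    #pragma once

    template <bool B, class T = void> struct enable_if {};
    template <class T> struct enable_if<true, T> { using type = T; };

    // GCC, Clang and MSVC all expose __is_enum as a compiler builtin.
    template <class T>
    inline constexpr bool is_enum_v = __is_enum(T);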
I would gladly use a new version of C++ that breaks compatibility but solves this problem.
This is the largest barrier for C++ right now.