It is funny how they are still touting that C++ exceptions are not (and should not be) slow and that we should "complain to your implementation purveyor" if they are. Yet the two most-used C++ compilers today, with top-notch optimizations (GCC and Clang), are still slow with exceptions.
Took this code [2] and tested with -O2 optimizations using GCC 11.x and Clang 13.x:
$ ./exceptions-test-gcc
3906 return code errors in 0.000562 seconds
3906 exception errors in 0.005740 seconds
Exceptions are 10.2x slower than return codes
$ ./exceptions-test-clang
3906 return code errors in 0.000142 seconds
3906 exception errors in 0.004568 seconds
Exceptions are 32.2x slower than return codes
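For anyone who does not want to follow the link, the benchmark is roughly of this shape (a rough sketch, not the exact code in [2]; the names and the error rate are illustrative, chosen to mirror the ~3906/1,000,000 failure rate):

    #include <chrono>
    #include <cstdio>
    #include <stdexcept>

    // Fails for roughly 1 in 256 inputs, mirroring the ~3906/1'000'000 rate above.
    static bool is_error(int i) { return i % 256 == 0; }

    static int with_return_code(int i) { return is_error(i) ? -1 : 0; }

    static void with_exception(int i) {
        if (is_error(i)) throw std::runtime_error("failure");
    }

    int main() {
        constexpr int N = 1'000'000;
        using clock = std::chrono::steady_clock;

        int code_errors = 0;
        auto t0 = clock::now();
        for (int i = 1; i <= N; ++i)
            if (with_return_code(i) != 0) ++code_errors;
        auto t1 = clock::now();

        int caught = 0;
        for (int i = 1; i <= N; ++i) {
            try { with_exception(i); } catch (const std::exception&) { ++caught; }
        }
        auto t2 = clock::now();

        auto us = [](auto d) {
            return std::chrono::duration_cast<std::chrono::microseconds>(d).count();
        };
        std::printf("%d return code errors in %lld us\n", code_errors, (long long)us(t1 - t0));
        std::printf("%d exception errors in %lld us\n", caught, (long long)us(t2 - t1));
    }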
Performance benchmarks can be tricky, especially when used to evaluate a feature of a programming language. In this case, by adding "no_inline" to both versions and pre-initializing the exception object, the difference in performance on my machine decreased from 22x to 2.8x. The question is which benchmark version provides a better assessment of the feature.
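Roughly, the shape of the change to the two benchmarked functions (my rendering, not the exact code; GCC/Clang attribute spelling):

    #include <stdexcept>

    // Pre-built exception object, so the throw path no longer also measures
    // string construction and allocation.
    static const std::runtime_error err("failure");

    __attribute__((noinline)) static int with_return_code(int i) {
        return (i % 256 == 0) ? -1 : 0;
    }

    __attribute__((noinline)) static void with_exception(int i) {
        if (i % 256 == 0) throw err;  // throws a copy of the pre-built object
    }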
The discussion about exception performance mainly focuses on the runtime performance implications of the exception process, which should be negligible. However, due to the current semantics of exception handling, the throw/catch process significantly affects control and data flow, leading most compilers to be more pessimistic about optimization.
The debate between error codes and exceptions seems like a false choice to me. Both have their strengths and weaknesses and should be used interchangeably in the same codebase depending on the circumstances.
In my opinion, the main advantage of exceptions is when there is a large distance between the point of error generation and the point in the program where there is the best information to handle the error. In such cases, monadic error handling would require every intermediary computation to handle the error, whereas exception propagation allows unrelated computations to essentially ignore the error state.
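A minimal sketch of what I mean, using C++23 std::expected for the monadic side (the names are illustrative; any Result-style type behaves the same way):

    #include <expected>   // C++23
    #include <iostream>
    #include <stdexcept>
    #include <string>

    // Monadic style: every intermediate layer has to notice and forward the error.
    std::expected<int, std::string> parse(const std::string& s) {
        if (s.empty()) return std::unexpected(std::string("empty input"));
        return static_cast<int>(s.size());
    }

    std::expected<int, std::string> intermediate(const std::string& s) {
        auto v = parse(s);
        if (!v) return std::unexpected(v.error());  // pure plumbing
        return *v * 2;
    }

    // Exception style: the intermediate layer ignores the error state entirely;
    // only the far-away caller with enough context to handle it catches it.
    int parse_or_throw(const std::string& s) {
        if (s.empty()) throw std::runtime_error("empty input");
        return static_cast<int>(s.size());
    }

    int intermediate_throwing(const std::string& s) {
        return parse_or_throw(s) * 2;  // no error-handling code at all
    }

    int main() {
        std::cout << intermediate("abc").value_or(-1) << '\n';
        try {
            std::cout << intermediate_throwing("") << '\n';
        } catch (const std::exception& e) {
            std::cout << "handled far away: " << e.what() << '\n';
        }
    }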
I would even suggest that C++ needs something like "Conditions and Restarts" as a more general exception mechanism.
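Purely as an illustration (not a concrete proposal), a restart-style mechanism would let an outer frame supply a recovery value at the signal site, without unwinding, falling back to a normal throw if nobody handles it:

    #include <functional>
    #include <iostream>
    #include <optional>
    #include <stdexcept>
    #include <string>
    #include <vector>

    // A condition is signalled at the error site, *before* any unwinding; an
    // outer frame may answer with a recovery value so execution continues here.
    using Handler = std::function<std::optional<int>(const std::string&)>;
    thread_local std::vector<Handler> handlers;

    std::optional<int> signal_condition(const std::string& what) {
        for (auto it = handlers.rbegin(); it != handlers.rend(); ++it)
            if (auto r = (*it)(what)) return r;  // some handler chose a "restart"
        return std::nullopt;                     // nobody did
    }

    int parse(const std::string& s) {
        if (s.empty()) {
            if (auto r = signal_condition("empty input")) return *r;  // restart: use value
            throw std::runtime_error("empty input");  // fall back to plain unwinding
        }
        return static_cast<int>(s.size());
    }

    int main() {
        handlers.push_back([](const std::string&) { return std::optional<int>(0); });
        std::cout << parse("") << '\n';  // prints 0: recovered without unwinding
    }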
> In this case, by adding "no_inline" to both versions and pre-initializing the exception object, the difference in performance on my machine decreased from 22x to 2.8x.
That’s not a realistic benchmark though. I’ve never seen real-world non-benchmark code which pre-initializes exceptions. They’re always created completely from scratch. Beyond it being uncommon, pre-initializing would also unnecessarily waste memory when you don’t throw the exception.
When it comes to no_inline, I think one should go in the opposite direction, because no_inline here just showed that error codes can be made slower when the compiler knows less about the code, bringing them closer to exceptions. What's disputed, though, is not that you can make ordinary code slower, but that compilers will realistically make your exception-throwing code faster. So I'd annotate with always_inline instead.
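For reference, the attribute spellings I mean (GCC/Clang; MSVC would use __declspec(noinline) and __forceinline):

    __attribute__((noinline)) int check_noinline(int i) {           // must stay an out-of-line call
        return (i % 256 == 0) ? -1 : 0;
    }

    __attribute__((always_inline)) inline int check_inline(int i) { // must be inlined at the call site
        return (i % 256 == 0) ? -1 : 0;
    }

    int main() { return check_noinline(256) + check_inline(1); }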
Yeah, that's why I mentioned that benchmarking features is always contentious and it's easy to devolve into semantic arguments... but here we go.
I guess it depends on what you mean by realistic and what the benchmark is trying to do. If by realistic we mean "representative", in the sense that the code captures a good approximation of real-world code, then I would argue that the original benchmark is completely wrong.
My version focuses on being precise: measuring one thing (exception handling) in as much isolation as possible.
With that perspective:
1 - The cost of error handling also depends on the amount of information being reported. In the original benchmark, one side is sending basically one bit of information and the other a full string. Pre-allocating the error was a way to normalize for that.
2 - I looked at the code, and most of the performance delta between the two versions came from the error-code version being vectorized; no_inline was a way to normalize for that as well.
Your benchmark supports the position that exceptions should be used instead of return codes for better average case performance, at least for GCC.
If your function throws an exception 9% of the time, then it outperforms error codes. That's because you always pay for the cost of an error code, regardless of success, but you only pay for the cost of an exception on failure.
I suspect most functions don't throw an exception 9% of the time that they are called.
For clang, yes the situation is worse, indicating that if your function throws an exception ~3% of the time it's better to use error codes, but once again I don't think most functions throw an exception 3% of the time.
If you need to optimize for latency, or real time performance, then exceptions are not at all suitable and I think most performance sensitive developers who have those requirements already know that.
In said benchmark the function threw an exception 3906 / 1000000 = 0.4% of the time and it was from 10 to 32 times (edit: 73.2 times on my Mac Mini M1) slower on average. 9% would be much worse.
EDIT: updated the ratio to reflect the GP comment instead of the SO post.
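For concreteness, a back-of-the-envelope version of that point using the GCC numbers above, under the simplifying assumptions (mine) that each run makes 1,000,000 calls and that exceptions cost essentially nothing on the success path:

    #include <cstdio>

    int main() {
        // GCC numbers from the top of the thread: 0.000562 s for return codes,
        // 0.005740 s for exceptions, with 3906 failures per assumed 1'000'000 calls.
        const double calls = 1e6, throws = 3906;
        const double rc_total_s = 0.000562, ex_total_s = 0.005740;
        const double rc_per_call  = rc_total_s / calls;   // ~0.56 ns, paid on every call
        const double ex_per_throw = ex_total_s / throws;  // ~1.47 us, paid only on failure
        // Break-even throw rate p solves: p * ex_per_throw == rc_per_call
        std::printf("break-even throw rate ~ %.3f%%\n", 100.0 * rc_per_call / ex_per_throw);
        // Prints roughly 0.038%, i.e. well below 9%.
    }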
The performance of the happy path matters a lot more, and I don't know of any compiler smart enough to return to different addresses for good and bad error codes. How much is the penalty for always checking a code after returning, compared to a try block that didn't catch anything (which should be free, apart from a compile time unwind table)?
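To make the comparison concrete, here is the shape of the two call sites I have in mind (my own sketch, assuming table-based EH as used by GCC, Clang, and 64-bit MSVC):

    #include <stdexcept>

    static int work_code(int i)  { return i < 0 ? -1 : i; }  // negative means failure
    static int work_throw(int i) {
        if (i < 0) throw std::runtime_error("bad input");
        return i;
    }

    int caller_code(int i) {
        int r = work_code(i);
        if (r < 0) return r;  // a branch on every call, even when nothing failed
        return r + 1;
    }

    int caller_throw(int i) {
        // With table-based ("zero cost") EH, no branch is emitted here; the unwind
        // tables are consulted only if something actually throws.
        return work_throw(i) + 1;
    }

    int main() { return caller_code(1) + caller_throw(1) - 4; }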
Once I was in a training session on the C++ STL, and the instructor told us not to use the vector at() method but instead to check the size and handle out-of-bounds cases some other way, because exceptions are slow.
Coming from another language, this sounded wild to me. There is a perfectly fine API in the standard library that we choose not to use, reinventing the wheel instead, because of an "implementation detail" of a compiler, something a beginner or even an intermediate-level C++ developer is likely unaware of.
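For reference, the two styles being compared; at() performs the same bounds check internally and reports failure by throwing std::out_of_range:

    #include <iostream>
    #include <vector>

    int main() {
        std::vector<int> v{1, 2, 3};
        std::size_t i = 10;

        // What the instructor recommended: check the size yourself.
        if (i < v.size())
            std::cout << v[i] << '\n';
        else
            std::cout << "out of range\n";

        // The standard-library API in question: at() does the bounds check
        // and reports failure by throwing std::out_of_range.
        try {
            std::cout << v.at(i) << '\n';
        } catch (const std::out_of_range& e) {
            std::cout << "out of range: " << e.what() << '\n';
        }
    }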
Unfortunately, C and C++ have too many folks who do performance analysis by gut feeling instead of using a profiler, often asserting things that are completely irrelevant on modern hardware.
Would like to hear more about it. Are you saying this (using .at vs "manual error handling") is premature optimization that might not have any effect on modern hardware?
I am saying that any optimization guided by what people say rather than by profiler results is nonsense.
When I started coding, my first computer was a Timex 2068, containing a Z80 running at 3.5 MHz, with 64 KB, 48 KB available for data, loading data by tape.
The first computer where I was programming in C++ was a 386SX running at 20 MHz (with a turbo button for 40 MHz, yippee!), with 2 MB RAM (only accessible with extenders, otherwise 640 KB) and a 40 MB hard disk (later stretched to about 80 MB thanks to the disk compression tools of the day), running DR-DOS 5.
My current phone holds an Exynos 1280, with 2 cores at 2.4 GHz and 6 cores at 2.0 GHz, 8 GB RAM, and SSD storage.
And my phone is a toy compared with what a standard desktop computer or server is capable of in 2024.
Additionally, all modern hardware architectures are NUMA, multi-core, and hybrid with GPUs.
So unless someone can really prove that the code is going to fail on the expected deployment hardware, really fall flat on the floor (say the execution deadline is 5 ms and the application can't get under 5 ms no matter what), they are needlessly cargo-culting optimizations.
I don't want to give too much credibility to the benchmark OP posted because it is quite flawed, but MSVC performs absolutely terribly on it, far worse than GCC or Clang, with a 58x slowdown when compiled in 64-bit mode. I can only imagine how much worse it would be in 32-bit mode, since MSVC in 32-bit mode does not use zero-cost exception handling but a frame-based setup/teardown exception handling mechanism.
The exception handling you're referring to about pinging back to Win32 I presume is structured exception handling, which is not the default exception handling mechanism used by MSVC, is not recommended to be used, and is not at all performance sensitive.
MSVC's C++ exceptions are based on Win32 exceptions, and are visible from an SEH handler. They're thrown via RaiseException() with an exception code of 0xE06D7363. Although mixing them is not recommended due to optimization concerns, C++EH scopes will unwind upon throwing an SEH exception and vice versa.
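A minimal MSVC-only sketch of that interaction (my own example; compile with /EHa if you actually mix the two models):

    #include <windows.h>
    #include <cstdio>
    #include <stdexcept>

    static void throws_cpp() { throw std::runtime_error("boom"); }

    static int filter(DWORD code) {
        // 0xE06D7363 is the SEH exception code MSVC uses for C++ exceptions.
        return code == 0xE06D7363 ? EXCEPTION_EXECUTE_HANDLER : EXCEPTION_CONTINUE_SEARCH;
    }

    int main() {
        __try {
            throws_cpp();
        } __except (filter(GetExceptionCode())) {
            std::puts("caught a C++ exception through an SEH handler");
        }
        return 0;
    }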
However, the point still stands: that isn't something compiler vendors see commercial value in, so to speak, when they are so constrained keeping up with ISO revisions and a big slice of the customer base keeps disabling exceptions anyway.
In general, yes, I would agree that modern C++ compilers could do a much better job on exception-enabled code, especially in terms of data-flow analysis.
But in this case, the benchmark itself is kinda flawed. The semantics of the two pieces of code are very different, so there isn't much the compiler could have reasonably done.
[1] https://isocpp.github.io/CppCoreGuidelines/CppCoreGuidelines...
[2] https://stackoverflow.com/a/78301673