I'm of course not suggesting branching in cases where you expect a 30% misprediction rate. You'd do a branchless reduction for [-2*pi; 2*pi] or whatever range you expect to be frequent, and branch only on inputs with magnitude greater than 2*pi if you want to be extra sure you don't get wrong results if usage changes.
Again, we're in a situation where we know we can tolerate a 0.5% error; we can spare a bit of time to think about which range needs to be handled fast, or supported at all.
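A minimal sketch of that split in C (function name and structure are mine, not from the thread): fold the common [-2*pi; 2*pi] range into [-pi, pi] branchlessly, and branch only on the rare larger magnitudes, falling back to a full reduction:

    #include <math.h>

    #ifndef M_PI
    #define M_PI 3.14159265358979323846
    #endif

    /* Fold x into [-pi, pi] for a range-limited sin approximation.
       Inputs in [-2*pi, 2*pi] take the branchless path; larger
       magnitudes hit a (rare, well-predicted) branch and a full,
       slower reduction via remainder(). */
    static double reduce_angle(double x) {
        if (fabs(x) > 2.0 * M_PI)
            return remainder(x, 2.0 * M_PI);  /* exact fold into [-pi, pi] */

        /* 1.0 where |x| > pi, else 0.0; compiles to compare + select,
           not a jump. */
        double over = (double)(fabs(x) > M_PI);
        return x - over * copysign(2.0 * M_PI, x);
    }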
Those reductions need to be part of the function being benchmarked, though. Even assuming a range limitation of [-pi, pi] would be reasonable; there are certainly cases where you don't need multiple revolutions around a circle. But this can't even do that, so it's simply not a substitute for sin, and claiming 40x faster is a sham.
Right; the range reduction from [-pi; pi] would be something like 5 instructions ("x -= (2*x - copysign(pi, x)) & (abs(x) > pi/2)" in mask-style pseudocode, using sin(x) = sin(pi - x)), ~2 cycles throughput-wise, I think; that's slightly more significant than I was imagining, hmm.
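A scalar C version of that fold, for reference (a sketch with my own naming; a SIMD variant would express the compare as an all-ones mask as in the pseudocode above):

    #include <math.h>

    #ifndef M_PI
    #define M_PI 3.14159265358979323846
    #endif

    /* Fold [-pi, pi] into [-pi/2, pi/2] without a branch, using
       sin(x) = sin(pi - x) (mirrored for negative x) so the folded
       input still evaluates to the same sine. */
    static double fold_half_pi(double x) {
        double over = (double)(fabs(x) > M_PI / 2.0);  /* 1.0 or 0.0 */
        return x + over * (copysign(M_PI, x) - 2.0 * x);
    }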
It's indeed not a substitute for sin in general, but it could be in some use-cases, and for those it could really be 40x faster: say, cases where you're already doing range reduction externally because it's needed for some other reason (in general you don't want your angles accumulating magnitude forever).
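As an illustration of that kind of use-case (hypothetical names, including fast_sin, not from the thread): an animation loop that already wraps its accumulated phase each step so it doesn't grow without bound, which hands the range-limited sin a [-pi, pi] input for free:

    #include <math.h>

    #ifndef M_PI
    #define M_PI 3.14159265358979323846
    #endif

    static double phase = 0.0;  /* stays in [-pi, pi] by construction */

    void step(double angular_velocity, double dt) {
        phase += angular_velocity * dt;
        /* Wrap every step anyway, to keep precision from degrading
           as the angle accumulates -- so the fast sin's range limit
           costs nothing extra here. */
        phase = remainder(phase, 2.0 * M_PI);
        /* ... double s = fast_sin(phase);  // may assume [-pi, pi] */
    }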