https://www.agner.org/optimize/instruction_tables.pdf, search for MXCSR (LDMXCSR and STMXCSR instructions).

Keep in mind that twiddling these flags requires saving the MXCSR register to memory, OR-ing or AND-ing bits in memory, and then loading that memory back into MXCSR. Both saving and loading MXCSR cause stalls, because floating-point operations both read and write that register. So you need, at minimum, 4 L1 cache hits and 2 partial pipeline flushes to twiddle an MXCSR bit.
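For concreteness, here is a rough sketch of that save / modify / reload sequence using the SSE intrinsics `_mm_getcsr` and `_mm_setcsr` from `<xmmintrin.h>`, which normally compile to STMXCSR and LDMXCSR. The helper name `set_ftz` and the choice of the flush-to-zero bit as the example are mine, not from the comment above, and this assumes an x86 target with SSE enabled.

```c
#include <xmmintrin.h>  /* _mm_getcsr, _mm_setcsr */

/* MXCSR bit 15 is the flush-to-zero (FTZ) control bit. */
#define MXCSR_FTZ_BIT 0x8000u

/* Hypothetical helper: set or clear FTZ with a read-modify-write of MXCSR.
 * Returns the previous MXCSR value so the caller can restore it later. */
static unsigned int set_ftz(int enable)
{
    unsigned int old = _mm_getcsr();   /* STMXCSR: store MXCSR to memory */
    unsigned int updated = enable ? (old | MXCSR_FTZ_BIT)
                                  : (old & ~MXCSR_FTZ_BIT);
    _mm_setcsr(updated);               /* LDMXCSR: load MXCSR from memory */
    return old;
}
```

A caller would do something like `unsigned int prev = set_ftz(1); /* ... */ _mm_setcsr(prev);`. How expensive the two instructions are depends on the microarchitecture, so treat this as an illustration of the sequence rather than a performance claim.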

(As far as I'm aware, modern microarchitectures generally don't register-rename the floating-point status register.)




Looks like many x86 cores still do rename MXCSR, though Gracemont notably doesn't: https://chipsandcheese.com/2021/12/21/gracemont-revenge-of-t...

Note that you wouldn't necessarily need to do a read-modify-write -- it'd suffice in most cases to just save the old value and then reset the whole MXCSR for the scope requiring special treatment.
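A minimal sketch of that save / overwrite / restore pattern, again assuming x86 with SSE and the `<xmmintrin.h>` intrinsics. The function name `mul_with_ftz` and the particular mode chosen (default settings plus flush-to-zero) are illustrative assumptions. The `volatile` locals are only there to discourage the compiler from moving the multiply outside the scoped region; a real build may need stricter floating-point environment settings.

```c
#include <xmmintrin.h>  /* _mm_getcsr, _mm_setcsr */

/* 0x1F80 is the MXCSR reset value: all exceptions masked,
 * round-to-nearest, FTZ and DAZ off. */
#define MXCSR_DEFAULT 0x1F80u
#define MXCSR_FTZ_BIT 0x8000u   /* flush-to-zero control bit */

/* Hypothetical example: run one multiply with FTZ enabled, then put the
 * caller's MXCSR back. No OR/AND on the old value is needed. */
static float mul_with_ftz(float x, float scale)
{
    volatile float a = x, b = scale;
    volatile float result;

    unsigned int saved = _mm_getcsr();           /* save the old value  */
    _mm_setcsr(MXCSR_DEFAULT | MXCSR_FTZ_BIT);   /* overwrite wholesale */
    result = a * b;                              /* scoped computation  */
    _mm_setcsr(saved);                           /* restore             */

    return result;
}
```

One consequence of restoring the saved value wholesale is that any sticky exception flags raised inside the scope are discarded, which may or may not be what you want.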


Also worth noting that it's not the entire MXCSR that needs to be renamed, but just a handful of status bits, so the logic is likely even cheaper than renaming a GPR.



