
Ergo "multiplication is reasonable" rather than "very fast". 3 cycles doesn't seem like a lot to me (searching suggests it's actually 5, but that's still not a lot for a 64-bit multiply, IMO).


Other than mathematical correctness, which was why John started working on unums, one of the coolest things about "Unums 2.0" as shown in this presentation is that all four arithmetic operations (add, subtract, multiply, divide) can take the same time. The 64-bit IEEE FPU my company is developing can do an add or subtract in a single cycle, a multiply or fused multiply-add in 3 cycles, and a divide in 17 cycles, all of which are pretty much the fastest reasonable implementations you can get.

To give you an idea of the actual time those operations take, an add takes around 300 picoseconds, while a multiply takes around 3,000 picoseconds. An order-of-magnitude increase in the time an operation takes is a lot when it comes to hardware complexity (and thus the associated area and power cost).


But, like I said, if big lookup tables work for Unums, they should also work for floats.
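To make the lookup-table point concrete, here is a minimal sketch in Python. It assumes a made-up 4-bit fixed-point format (codes 0 to 15 decoding to code/4), which is neither unums nor IEEE floats; `VALUES`, `encode`, `build_table` and `table_entries` are hypothetical names for illustration. The construction doesn't care what the bit patterns mean, so it works the same way for a small float format. What matters is the table size, which is 2^(2n) entries per operation for n-bit operands.

```python
from fractions import Fraction

BITS = 4
# Hypothetical toy format: each 4-bit code decodes to code/4 (0, 0.25, ..., 3.75).
VALUES = [Fraction(code, 4) for code in range(1 << BITS)]

def encode(x):
    """Return the code of the representable value nearest to x.

    Ties go to the lower code; out-of-range results saturate at the ends.
    """
    return min(range(len(VALUES)), key=lambda c: abs(VALUES[c] - x))

def build_table(op):
    """Precompute op for every pair of codes: 2**(2*BITS) entries in total."""
    return [[encode(op(a, b)) for b in VALUES] for a in VALUES]

# Once the tables exist, every operation is a single indexed read,
# so add and multiply cost the same.
ADD = build_table(lambda a, b: a + b)
MUL = build_table(lambda a, b: a * b)

def table_entries(bits):
    """Number of entries in a full two-operand table for bits-wide operands."""
    return 1 << (2 * bits)
```

With this toy format, `MUL[encode(Fraction(3, 2))][encode(2)]` is one table read that yields the code for 3. The catch is growth: `table_entries(4)` is 256, `table_entries(16)` is already 2^32, and `table_entries(64)` is 2^128, so a full table is only practical for narrow formats, whether they are unums or floats.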



