> the scalability of the technique was limited because it used high-order rationals requiring 80 bits of precision. We address the sources of numerical difficulty and allow the same computation to be performed using 64 bits. Crucially, this enables computation on the GPU, and computing a 10-billion-triangle model now takes 17 days instead of 10 years.
While improving numerical stability is interesting and useful work, a GPU isn't strictly limited to 64-bit math. You can glue two 64-bit floats together to act like a much more precise float. It will be a big hit to efficiency, but nowhere near a 200x hit!
How do you propose gluing together two floating-point numbers? Concatenating the exponents and mantissas sounds good, but that isn't implementable with the existing floating-point hardware, because the operations become intrinsically coupled between the two primitive numbers.
The other posters are right: double-doubles and quad-doubles.
The way to think of them: for a real number r, the double d(r) is the closest value representable in a double, but there may be some error between the real value and that floating-point approximation.
Store that error in another double, dd(r) = r - d(r); it has a much smaller exponent, and, most importantly, the pair together gives you roughly twice as many bits of precision.
Then carefully implement the +, -, *, / operations on these pairs, and you're off to the races.
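For the curious, here's a minimal sketch of the idea in C++ (the names `dd`, `two_sum`, `two_prod`, etc. are just illustrative, not any particular library's API; real double-double libraries such as QD or campary are considerably more careful about edge cases and error bounds):

```cpp
#include <cmath>
#include <cstdio>

// A "double-double": the value is the unevaluated sum hi + lo, where lo
// holds the rounding error that hi alone cannot represent.
struct dd { double hi, lo; };

// Knuth's TwoSum: s = fl(a + b), e = exact rounding error, so a + b == s + e.
static dd two_sum(double a, double b) {
    double s = a + b;
    double v = s - a;
    double e = (a - (s - v)) + (b - v);
    return {s, e};
}

// Error-free product via fused multiply-add: a * b == p + e exactly.
static dd two_prod(double a, double b) {
    double p = a * b;
    double e = std::fma(a, b, -p);
    return {p, e};
}

// Simplified ("sloppy") double-double addition; full accuracy needs more care.
static dd dd_add(dd a, dd b) {
    dd s = two_sum(a.hi, b.hi);
    s.lo += a.lo + b.lo;
    return two_sum(s.hi, s.lo);  // renormalize so lo is tiny relative to hi
}

// Double-double multiplication: exact high product plus the cross terms.
static dd dd_mul(dd a, dd b) {
    dd p = two_prod(a.hi, b.hi);
    p.lo += a.hi * b.lo + a.lo * b.hi;
    return two_sum(p.hi, p.lo);
}

int main() {
    // Adding 1 and 1e-17 in a plain double loses the small term entirely;
    // the double-double keeps it in the lo component.
    dd a = {1.0, 0.0};
    dd b = {1e-17, 0.0};
    dd s = dd_add(a, b);
    std::printf("hi = %.17g  lo = %.17g\n", s.hi, s.lo);  // lo ~ 1e-17
}
```

The same pattern ports directly to CUDA, since the error-free transformations only need correctly rounded +, *, and fma, all of which GPU hardware provides.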
I've implemented them from scratch on many platforms over the years, often to do deep Mandelbrot runs, since they offer really good performance in the gap between plain doubles and arbitrary-precision libraries.
But it's always the same idea: one double as normal, and another double to represent the difference between the value you care about and the double that holds the higher-order bits.
Double-double and quad-double arithmetic have even been implemented for the GPU as research prototype libraries (gpuprec [0,1] and campary [2,3], among others).