Suppose you make two simple types, one for Kelvin (K) and the other for Fahrenheit (F) or degrees (D).
And you implement the conversions between them in the types.
But then you have something like:
d: D = 10
for i = 1...100000:
    k = f_Take_D_Return_K(d)
    d = g_Take_K_Return_D(k)
end
Then you will implicitly get many, many automatic conversions that serve no purpose.
How do you handle this? Is it easily caught by the compiler when the functions are far more complex?
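To make the loop concrete, here is a sketch in Go (the type and function names are hypothetical, not from any particular codebase): distinct wrapper types force explicit conversions, and the round trip pays two conversions per iteration even though, for pure conversions, they cancel out (up to rounding).

```go
package main

import "fmt"

// Distinct wrapper types make mixing the two units a compile-time error.
type Degrees float64
type Kelvin float64

// Hypothetical stand-ins for the conversions hidden inside f and g.
func toKelvin(d Degrees) Kelvin  { return Kelvin(float64(d) + 273.15) }
func toDegrees(k Kelvin) Degrees { return Degrees(float64(k) - 273.15) }

func main() {
	d := Degrees(10)
	// Each iteration pays two conversions that undo each other.
	for i := 0; i < 100000; i++ {
		k := toKelvin(d) // D -> K
		d = toDegrees(k) // K -> D, undoing the line above
	}
	// If the conversions are pure, the loop body is a no-op for d, so
	// the fix is to hoist the conversion out of the loop, or keep the
	// value in a single unit for the whole pipeline.
	fmt.Println(d)
}
```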
In F#, which has measure types, the types are checked at compile time but erased at runtime, so they have no additional runtime cost. Measures are a kind of painted type.
[<Measure>] type degC
[<Measure>] type K

let degrees_to_kelvin (degrees : float<degC>) : float<K> =
    degrees * 1.0<K/degC> + 273.15<K>

let d = 10.0<degC>
let k : float<K> = degrees_to_kelvin d
The .NET runtime only sees `float`, as the measures have been erased, and constant folding will remove the `* 1.0` we used to change the measure. The `degrees_to_kelvin` call may also be inlined by the JIT compiler. We could add `[<MethodImpl(MethodImplOptions.AggressiveInlining)>]` to encourage inlining where possible; constant folding may then reduce the whole expression down to its result in the binary.
The downside to encoding SI units in the type system is that SI is not a sound type system. For example:
[<Measure>] type m
[<Measure>] type s
[<Measure>] type kg
[<Measure>] type N = kg*m/s^2
[<Measure>] type J = kg*m^2/s^2
[<Measure>] type Nm = N*m
let func_expecting_torque (t : float<Nm>) = ...
let x = 10.0<J>
func_expecting_torque x
The type system will permit this: using torque where energy is expected, and vice versa, because they have the same SI unit. But they don't represent the same quantity, and ideally this should be rejected. A potential improvement is to include Siano's Orientational Analysis[1], which can resolve this particular unsoundness, because the orientations of `Nm` and `J` would be incompatible.
I interpret your question as: "given that I am doing many conversions between temperature units, because that makes it easier to write correct code, I worry that my code will be slow because of all those conversions".
My response is: these conversions are unlikely to be the slow step in your code, don’t worry about it.
I do agree, though, that it would be nice if the compiler could simplify the math to remove the conversions between units. I don't know of any languages that can do that.
That's exactly the problem: in the software I have in mind, the conversions actually are very slow, and I can't easily change the contents of the functions that process the data. They are very mathematical, and it would take a long time to rewrite everything.
For example, it's not my case, but it's like having to convert between two image representations (a matrix multiply per pixel) every time.
I'm scared that this kind of 'automatic conversion' slowness will be extremely difficult to debug and to monitor.
Why would it be difficult to monitor the slowness? Wouldn’t a million function calls to the from_F_to_K function be very noticeable when profiling?
On your case about swapping between image representations: say you're doing an FFT to transform between the real and reciprocal representations of an image. You probably have to do that transformation in order to do the work you need in reciprocal space. There's no getting around it. Or am I misunderstanding?
Please don’t take my response as criticism, I’m genuinely interested here, and enjoying the discussion.
I have many functions, written by many scientists in a single piece of software over many years; some expect one data format, others another. It's not always the same function that is called, but all the functions could have been written against a single data format. However, each author chose the data format based on the application at hand at the time and the possible acceleration of their algorithms with the selected data structure.
When I tried to refactor using types, this kind of problem became obvious, and the refactoring forced more conversions than intended.
So I'm really curious, because apart from rewriting everything, I don't see how to avoid this problem. It's more natural for some applications to use data format 1 and for others data format 2, and forcing one over the other would make the application slow.
The problem arises only in 'hybrid' pipelines, when new scientists need to combine existing functions, some written for the first data format and some for the other.
As a simple example, you can implement rotations in software in many ways: matrix multiplication, Euler angles, quaternions, geometric algebra. Which one works best depends on the application at hand, since it maps better onto the mental model of the current problem. For example, geometric algebra is far better for thinking about a problem, but sometimes Euler angles are what a physical sensor outputs. So some scientists will use the first, and others the second. (Of course, these particular conversions are fairly trivial and we don't care much, but suppose each conversion is very expensive for one reason or another.)
If I understood the problem correctly, you should try calculating each format of the data once and reusing it. Something like:
type ID struct {
    AsString   string
    AsInt      int
    AsWhatever Whatever
}

func NewID() ID {
    return ID{
        AsString:   calculateAsString(),
        AsInt:      calculateAsInt(),
        AsWhatever: calculateAsWhatever(),
    }
}
This does assume every representation will always be used; if that's not the case, it's a matter of using a generic only-once executor, like Go's sync.Once.
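The lazy variant hinted at in the last sentence could look like this in Go (a minimal sketch; `Data` and the `Itoa` stand-in are hypothetical): each representation is computed at most once, on first use.

```go
package main

import (
	"fmt"
	"strconv"
	"sync"
)

// Data keeps one canonical value and computes the expensive
// representation lazily, at most once, via sync.Once.
type Data struct {
	raw        int
	onceString sync.Once
	asString   string
}

// AsString runs the expensive conversion on first use only;
// later calls return the cached result.
func (d *Data) AsString() string {
	d.onceString.Do(func() {
		// Stand-in for an expensive conversion.
		d.asString = strconv.Itoa(d.raw)
	})
	return d.asString
}

func main() {
	d := &Data{raw: 42}
	fmt.Println(d.AsString()) // converted now
	fmt.Println(d.AsString()) // cached; no second conversion
}
```

sync.Once is also safe under concurrent access, which matters if several pipeline stages may ask for the same representation at the same time.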