It's not from 15 years ago, but I remember this[0] being shared on HN 5 years ago. I can't find them at the moment, but I also remember seeing whitepapers outlining the same techniques before that.
This one is interesting to me, now, considering who the author is.
I saw a talk at Defcon many years ago showing this. I think they just used statistics. I can't find the video, but here is the PDF of their presentation:
The study didn't use special characters and used the same laptop keyboard throughout, which I would imagine significantly inflates the accuracy percentage compared to what an attack would achieve in a real-world situation.
The particular method suggested is about pairing known text with the corresponding audio to build up accuracy on other recordings of the same typist. E.g., a meeting with a chat feature or an office app may provide such correlated examples.
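To make that concrete, here is a rough sketch of the idea as I understand it, not the paper's actual pipeline. It assumes the keystrokes have already been cut out of the audio and turned into small feature vectors (the toy numbers below are made up), and it uses a plain nearest-centroid classifier as a stand-in for whatever model the authors trained. The function names are hypothetical.

```python
from collections import defaultdict
from math import dist


def label_keystrokes(features, known_text):
    """Pair each keystroke's feature vector with the character known to have been typed."""
    if len(features) != len(known_text):
        raise ValueError("keystroke count must match the length of the known text")
    return list(zip(known_text, features))


def train_centroids(labeled):
    """Average the feature vectors seen for each character."""
    groups = defaultdict(list)
    for ch, vec in labeled:
        groups[ch].append(vec)
    return {
        ch: tuple(sum(col) / len(col) for col in zip(*vecs))
        for ch, vecs in groups.items()
    }


def decode(features, centroids):
    """Guess each keystroke as the character with the nearest centroid."""
    return "".join(
        min(centroids, key=lambda ch: dist(centroids[ch], vec)) for vec in features
    )


# Toy data (assumed): text the typist sent in a chat, plus features from the matching audio.
known_text = "abba"
known_features = [(0.0, 1.0), (1.0, 0.0), (0.9, 0.1), (0.1, 0.9)]
centroids = train_centroids(label_keystrokes(known_features, known_text))

# A later recording of the same typist, with no text to go with it.
print(decode([(0.2, 0.8), (0.8, 0.2), (0.05, 1.0)], centroids))  # prints "aba"
```

The point is only that the chat text supplies the labels for free; the hard parts in practice (segmenting keystrokes, extracting features, handling a different keyboard or room) are not shown here.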