For my use cases, this has already beaten all "traditional approaches" for at least a few months now. That's just inferring from when I first stumbled across it; no clue how long it's been a thing.
I did some OCR tests on 1960s-era documents (all in English), a mix of typed and handwritten. My results:
Google Vision: 95.62% HW - 99.4% Typed
Amazon Textract: 95.63% HW - 99.3% Typed
Azure: 95.9% HW - 98.1% Typed
For the curious, TrOCR was the best FOSS solution at 79.5% HW and 97.4% Typed. (However, it took roughly 200x longer than Tesseract, which scored 43% HW and 97.0% Typed.)
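Percentages like these are typically character-level accuracy, i.e. 1 minus the character error rate (CER) against a ground-truth transcription. As a hedged sketch of how such numbers can be computed (the function names are my own, not from any of the tools above):

```python
def edit_distance(a: str, b: str) -> int:
    # Classic dynamic-programming Levenshtein distance between two strings.
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,          # deletion
                           cur[j - 1] + 1,       # insertion
                           prev[j - 1] + (ca != cb)))  # substitution
        prev = cur
    return prev[-1]

def char_accuracy(reference: str, hypothesis: str) -> float:
    # Accuracy = 1 - CER, clamped at 0 for hypotheses worse than empty output.
    if not reference:
        return 1.0 if not hypothesis else 0.0
    cer = edit_distance(reference, hypothesis) / len(reference)
    return max(0.0, 1.0 - cer)

# One substituted character out of 11 -> accuracy of 1 - 1/11.
print(round(char_accuracy("hello world", "hallo world"), 4))  # → 0.9091
```

Real benchmarks usually also normalize whitespace and casing before scoring, which can shift results by a point or two, so take cross-test comparisons with a grain of salt.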
When did you do this test? I don't have any numbers handy, but a couple of years ago I compared Google's OCR against AWS's on "text in the wild" pictures. AWS's wasn't bad, but it was definitely outperformed by Google's. The open-source solutions I tried (Tesseract and some academic deep-learning code) were far behind.
This was a couple of months ago, so not that long ago. For OCR I have found that results depend heavily on the type of image. In my case these were all scanned documents of good but not great scan quality, all in English. I expect that with random photos containing text, the FOSS solutions would do much worse, and you'd see much more variance among Google, Amazon, and Azure. I'd be curious about the academic deep-learning one you tried.
The main one was https://github.com/JaidedAI/EasyOCR, mostly because, as promised, it was pretty easy to use, and uses pytorch (which I preferred in case I wanted to tweak it). It has been updated since, but at the time it was using CRNN, which is a solid model, especially for the time - it wasn't (academic) SOTA but not far behind that. I'm sure I could've coaxed better performance than I got out of it with some retraining and hyperparameter tuning.
Interesting. I tried EasyOCR and found it hit about 35% on handwriting and 95.7% on typed: not bad at all for typed text, but pretty bad for handwriting. I focused on Tesseract and TrOCR since it wasn't working out that well, though that could easily have just been my particular use case.
I also tested PaddleOCR and keras-ocr to round them all out.
At some point I really need to finish my project enough to write up some blog articles and post a bunch of code repos for others to use.
I did not check. I also never checked whether they share my emails on Google Search with you, but I trust their ambition not to be sued into the ground for doing something immensely stupid.
Leaking sensitive enterprise-customer data as training material for public reCAPTCHAs falls squarely in that category.