For my use cases, this has already beaten all "traditional approaches" for at least a few months now. That's just inferring from when I first stumbled across it; no clue how long it's been a thing.
I did some OCR tests on 1960s-era documents (all in English), a mix of typed and handwritten. My results:
Google Vision: 95.62% HW - 99.4% Typed
Amazon Textract: 95.63% HW - 99.3% Typed
Azure: 95.9% HW - 98.1% Typed
For the curious, TrOCR was the best FOSS solution at 79.5% HW and 97.4% Typed. (However, it took roughly 200x longer than Tesseract, which scored 43% HW and 97.0% Typed.)
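Percentages like these are typically character-level accuracy, i.e. 1 minus the character error rate (CER) against a ground-truth transcription. As a hedged sketch of how such numbers can be computed (the function names are my own, not from any of the tools above):

```python
def edit_distance(a: str, b: str) -> int:
    # Classic dynamic-programming Levenshtein distance between two strings.
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,          # deletion
                           cur[j - 1] + 1,       # insertion
                           prev[j - 1] + (ca != cb)))  # substitution
        prev = cur
    return prev[-1]

def char_accuracy(reference: str, hypothesis: str) -> float:
    # Accuracy = 1 - CER, clamped at 0 for hypotheses worse than empty output.
    if not reference:
        return 1.0 if not hypothesis else 0.0
    cer = edit_distance(reference, hypothesis) / len(reference)
    return max(0.0, 1.0 - cer)

# One substituted character out of 11 -> accuracy of 1 - 1/11.
print(round(char_accuracy("hello world", "hallo world"), 4))  # → 0.9091
```

Real benchmarks usually also normalize whitespace and casing before scoring, which can shift results by a point or two, so take cross-test comparisons with a grain of salt.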
When did you do this test? I don't have any numbers handy, but a couple of years ago I compared Google's OCR against AWS's on "text in the wild" pictures. AWS's wasn't bad, but it was definitely outperformed by Google's. The open-source solutions I tried (Tesseract and some academic deep-learning code) were far behind.
This was a couple of months ago, so not that long ago. For OCR I have found that results depend heavily on the type of image. In my case these were all scanned documents of good but not great scan quality, all in English. I expect that with random photos containing text, the FOSS solutions would do much worse, and you'd see much more variance among Google, Amazon, and Azure. I'd be curious about the academic deep-learning one you tried.
The main one was https://github.com/JaidedAI/EasyOCR, mostly because, as promised, it was pretty easy to use, and uses pytorch (which I preferred in case I wanted to tweak it). It has been updated since, but at the time it was using CRNN, which is a solid model, especially for the time - it wasn't (academic) SOTA but not far behind that. I'm sure I could've coaxed better performance than I got out of it with some retraining and hyperparameter tuning.
Interesting. I tried EasyOCR and found it hit about 35% on handwriting and 95.7% on typed: not bad at all for typed text, but pretty bad for handwriting. I focused on Tesseract and TrOCR since it wasn't working out that well, though that could easily have just been my particular use case.
I also tested PaddleOCR and keras-ocr to round them all out.
At some point I really need to finish my project enough to write up some blog articles and post a bunch of code repos for others to use.
I did not check. I also never checked whether they share my emails on Google Search with you, but I trust their ambition not to be sued into the ground for doing something immensely stupid.
Leaking sensitive enterprise-customer data as training material for public reCAPTCHAs falls squarely in that category.