Hey there, I’ve done some OCR work with Japanese before, building a web app that reads, sorts, splits and merges PDFs based on their content.
For that work, I did some investigation into available OCR APIs, and specifically I tried Google Cloud Vision, ABBYY Cloud OCR and Tesseract. I believe I tried some others, but I don’t remember what they were because I never got around to seriously considering them.
Google Cloud Vision performed the best overall for my purposes, but only just barely. Tesseract is an open source OCR project that was developed and maintained by Google from 2006 to 2018. It was hard to get working, but in the end it was only slightly less accurate overall than the paid Google Cloud Vision service. Tesseract is also not a service, meaning you’ll need to either add it as a dependency in your code or host it somewhere yourself. That could be either a disadvantage or an advantage depending on your project requirements. For us, the fact that it was not a service was a positive.
ABBYY Cloud OCR, at least in our tests and for our purposes, was significantly less effective than Google Cloud Vision or Tesseract.
The main downside to Tesseract is that you’ve got to configure it (and potentially train it) yourself. Also, Google Cloud Vision has built-in features that process your photo to make it easier for their OCR software to read, whereas with Tesseract any image processing that needs to be done has to be done by you.
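To give a concrete idea of the kind of image processing you end up doing yourself, here is a minimal sketch of one common step: binarization with Otsu's threshold, which turns a grayscale scan into pure black and white. This is not the code from our project. It is pure Python over rows of 0-255 grayscale values so it is easy to follow, the function names are made up for illustration, and in practice you would more likely reach for an imaging library.

```python
def otsu_threshold(pixels):
    """Return the Otsu threshold (0-255) for a flat list of grayscale values."""
    histogram = [0] * 256
    for value in pixels:
        histogram[value] += 1

    total = len(pixels)
    sum_all = sum(level * count for level, count in enumerate(histogram))

    best_threshold = 0
    best_variance = -1.0
    weight_bg = 0  # number of pixels at or below the candidate level
    sum_bg = 0     # sum of their grayscale values
    for level in range(256):
        weight_bg += histogram[level]
        if weight_bg == 0:
            continue
        weight_fg = total - weight_bg
        if weight_fg == 0:
            break
        sum_bg += level * histogram[level]
        mean_bg = sum_bg / weight_bg
        mean_fg = (sum_all - sum_bg) / weight_fg
        # Between-class variance (up to a constant factor); Otsu maximizes it.
        between = weight_bg * weight_fg * (mean_bg - mean_fg) ** 2
        if between > best_variance:
            best_variance = between
            best_threshold = level
    return best_threshold


def binarize(rows):
    """Map rows of grayscale values to black (0) or white (255)."""
    flat = [value for row in rows for value in row]
    threshold = otsu_threshold(flat)
    return [[255 if value > threshold else 0 for value in row] for row in rows]
```

Whether a step like this helps depends a lot on your scans, so it is worth checking the OCR output with and without it.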
I’d recommend at least trying Tesseract. If it works for your purposes, then you don’t need to tie yourself down to a third party service.
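If you just want to try it quickly, something along these lines should be enough. It is a sketch that assumes the `tesseract` binary and the Japanese language data (`jpn`) are already installed and on your PATH. The helper names and the file name `scan.png` are placeholders, and the page segmentation mode (`--psm`) is a setting you would tune for your own documents.

```python
import subprocess


def build_tesseract_command(image_path, lang="jpn", psm=6):
    """Build the argument list for a one-off Tesseract CLI run.

    Passing "stdout" as the output base makes Tesseract print the
    recognized text instead of writing it to a .txt file.
    """
    return ["tesseract", image_path, "stdout", "-l", lang, "--psm", str(psm)]


def run_tesseract(image_path, lang="jpn", psm=6):
    """Run Tesseract and return the recognized text.

    Raises CalledProcessError if Tesseract exits with an error, and
    FileNotFoundError if the binary is not installed.
    """
    command = build_tesseract_command(image_path, lang=lang, psm=psm)
    result = subprocess.run(
        command, capture_output=True, encoding="utf-8", check=True
    )
    return result.stdout


# Example (placeholder file name):
# text = run_tesseract("scan.png")
```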
For our project, we went with Tesseract. Again, at least for our purposes, Google Cloud Vision was something like 98% accurate, and Tesseract, after configuring it, was something like 96% accurate. Google Cloud Vision worked a bit better, but it was more convenient for us not to rely on a third-party service, and we liked that Tesseract was free.
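If you want to run the same kind of comparison on your own documents, one common way to put a number on accuracy is character-level accuracy: one minus the edit distance between the OCR output and a hand-checked transcription, divided by the length of the transcription. The sketch below shows that metric. It is not necessarily how the rough figures above were produced, and the function names are made up for illustration.

```python
def edit_distance(a, b):
    """Levenshtein distance between two strings, computed row by row."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # delete a character from a
                current[j - 1] + 1,      # insert a character into a
                previous[j - 1] + cost,  # substitute (or keep a match)
            ))
        previous = current
    return previous[-1]


def character_accuracy(ocr_text, ground_truth):
    """Fraction of the ground truth recovered, between 0.0 and 1.0."""
    if not ground_truth:
        return 1.0 if not ocr_text else 0.0
    errors = edit_distance(ocr_text, ground_truth)
    return max(0.0, 1.0 - errors / len(ground_truth))
```

Run it over the same set of pages for each OCR engine and average the results, so that you are comparing like with like.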