Any Japanese OCR lib/API recommendations?

d-hermit · September 5, 2021, 9:51am

Hello everyone!

I’m building an app (primarily to help my own studies) with a dictionary lookup.

Since I like reading paper books, it’d be convenient to take a photo of a page and select the kanji to look up.

The Google Translate app currently has this feature, but I don’t like how it works and I’d like to avoid using multiple apps, which is why I want to build my own.

Do you have any recommendations for libraries or APIs to extract Japanese text from images?

So far I found 3 paid APIs:

The ABBYY one doesn’t even list prices, which isn’t a good sign…

MichaelCharles · September 5, 2021, 10:59am

Hey there, I’ve done some OCR work before with Japanese to build a web app which reads, sorts, splits and merges PDFs based on their content.

For that work, I did some investigation into available OCR APIs, and specifically I tried Google Cloud Vision, ABBYY Cloud OCR and Tesseract. I believe I tried some others, but I don’t remember what they were because I never got around to seriously considering them.

Google Cloud Vision performed the best overall for my purposes, but only just barely. Tesseract is an open source OCR project that was developed and maintained by Google from 2006 to 2018. It was hard to get working, but in the end was only slightly less accurate overall than the paid Google Cloud Vision service. Tesseract is also not a service, meaning you’ll need to either add it as a dependency in your code or host it somewhere yourself. That could be either a disadvantage or an advantage depending on your project requirements. For us, the fact that it was not a service was a positive.

ABBYY Cloud, at least in our tests and for our purposes, was significantly less effective than Google Cloud Vision or Tesseract.

The main downside to Tesseract is that you’ve got to configure it (and potentially train it) yourself. Also, Google Cloud Vision has built-in features to process your photo and make it easier for their OCR software to read. With Tesseract, any image processing that needs to be done has to be done by you.
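As an illustration of the kind of image processing you would have to do yourself, here is a minimal sketch of one common step: binarizing a grayscale page with Otsu's method before handing it to the OCR engine. The post does not say which preprocessing was actually used, so the choice of Otsu thresholding and the function names `otsu_threshold` and `binarize` are assumptions made for this example. It works on a flat list of 0..255 gray values so that it needs no imaging library.

```python
def otsu_threshold(pixels):
    """Return the gray level t (0..255) that best separates dark ink from
    light paper, by maximising the between-class variance (Otsu's method).
    Illustrative only: the original post does not name a technique."""
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1

    total = len(pixels)
    sum_all = sum(level * count for level, count in enumerate(hist))

    best_t, best_var = 0, -1.0
    w_bg = 0    # number of pixels at or below the candidate threshold
    sum_bg = 0  # sum of gray values of those pixels
    for t in range(256):
        w_bg += hist[t]
        sum_bg += t * hist[t]
        if w_bg == 0:
            continue          # nothing on the dark side yet
        w_fg = total - w_bg
        if w_fg == 0:
            break             # everything is on the dark side
        mean_bg = sum_bg / w_bg
        mean_fg = (sum_all - sum_bg) / w_fg
        between_var = w_bg * w_fg * (mean_bg - mean_fg) ** 2
        if between_var > best_var:
            best_var, best_t = between_var, t
    return best_t


def binarize(pixels):
    """Map every pixel to pure black (0) or pure white (255)."""
    t = otsu_threshold(pixels)
    return [0 if p <= t else 255 for p in pixels]
```

In a real app the pixel list would come from your image library of choice, and photos of book pages usually also need deskewing and cropping, which this sketch does not attempt.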

I’d recommend at least trying Tesseract. If it works for your purposes, then you don’t need to tie yourself down to a third party service.

I think the main Tesseract engine is written in C++, but there are ports for various languages you can find if you search. There’s even one written in pure JavaScript.
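If you would rather not depend on a wrapper library at all, one option is to call the `tesseract` command-line program directly. The sketch below assumes the `tesseract` binary and the Japanese language data (`jpn`, and `jpn_vert` for vertical text) are installed; the helper names `build_tesseract_command` and `ocr_image` are made up for this example, and the page segmentation modes chosen here are a starting guess rather than a tuned configuration.

```python
import subprocess


def build_tesseract_command(image_path, vertical=False):
    """Build the argument list for the tesseract CLI.

    Assumptions: the 'jpn' / 'jpn_vert' traineddata files are installed.
    --psm 6 treats the image as one uniform block of text;
    --psm 5 treats it as one uniform block of vertically aligned text.
    """
    lang = "jpn_vert" if vertical else "jpn"
    psm = "5" if vertical else "6"
    # 'stdout' as the output base makes tesseract print the text
    # instead of writing it to a file.
    return ["tesseract", image_path, "stdout", "-l", lang, "--psm", psm]


def ocr_image(image_path, vertical=False):
    """Run tesseract on an image and return the recognised text."""
    result = subprocess.run(
        build_tesseract_command(image_path, vertical),
        capture_output=True,
        encoding="utf-8",
        check=True,  # raise CalledProcessError if tesseract fails
    )
    return result.stdout.strip()
```

Calling `ocr_image("page.png", vertical=True)` (with `page.png` standing in for your own photo) would then return the page text as a string.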

For our project, we went with Tesseract. Again, at least for our purposes, Google Cloud Vision was something like 98% accurate, and Tesseract after configuring it was something like 96% accurate. While Google Cloud Vision worked a bit better, it was more convenient for us to not rely on a third party service, and we liked that Tesseract was free.
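To compare engines on your own pages, you need some way to put a number on "accuracy". The post does not say how the figures above were measured, so the following is only one plausible approach: character accuracy, defined here as 1 minus the edit distance between the OCR output and a hand-typed reference, divided by the reference length. The function names are chosen for this example.

```python
def edit_distance(a, b):
    """Levenshtein distance: minimum number of single-character insertions,
    deletions, and substitutions needed to turn a into b."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(
                prev[j] + 1,               # delete ca
                cur[j - 1] + 1,            # insert cb
                prev[j - 1] + (ca != cb),  # substitute (free if equal)
            ))
        prev = cur
    return prev[-1]


def char_accuracy(reference, ocr_output):
    """Character accuracy in the range 0.0..1.0 (an assumed metric,
    not necessarily the one used in the post)."""
    if not reference:
        raise ValueError("reference text must not be empty")
    errors = edit_distance(reference, ocr_output)
    return max(0.0, 1 - errors / len(reference))
```

For example, if the reference is 日本語の本を読む and the engine returns 日本語の木を読む (本 misread as 木), that is one error in eight characters, so `char_accuracy` gives 0.875.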

d-hermit · September 5, 2021, 12:04pm

Thanks for sharing your experience!

Since it’s just a personal project for me I don’t think I want to invest the time into using Tesseract but I see the appeal of not relying on a 3rd party service.

NicoleIsEnough · September 5, 2021, 2:17pm

Maybe @jprspereira can share some knowledge here? He ran some OCR experiments but I have no idea how it went…

2OC3aOdKgwSGlxfz · September 5, 2021, 10:13pm

Not sure if this is helpful as it’s not exactly an API, but I use Capture2Text as my main OCR while reading manga. It does have a command line version too, so potentially you could invoke it that way from an external application and then process the resulting text string. It’s also Open Source, so you could also embed part of the code into your own software.

A potential showstopper is that it’s Windows only.

I’ve been using it for many years. For reasonably standard fonts it does a very good job, and it even gets close enough with some fancier fonts.

d-hermit · September 5, 2021, 10:50pm

Interesting, looks like it’s using Tesseract under the hood. Thanks for sharing.

Dolgoipa · March 1, 2023, 5:11pm

That’s a great idea for an app to help with your studies. I can understand why you’d want to build your own instead of using multiple apps.
Smart Engines OCR is a great tool for extracting Japanese text from images. It’s an optical character recognition (OCR) software that uses deep learning algorithms to accurately recognize and extract text from various sources, including images, documents, and even video streams.
I hope this helps, and good luck with your app development! If you have any other questions or concerns, feel free to ask.

eglepe · March 1, 2023, 7:25pm

I use Poricom for all my Japanese OCR needs. It can load a directory of images directly in the app so it’s great for both manga and scanned books, but it can also scan anything on the screen directly, which is handy for random text in videos too.

It’s based on tesserocr and the MangaOCR library so you might be able to use it for your project.
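For a dictionary-lookup app, a rough sketch of using the manga-ocr Python package directly might look like the following. It assumes the package is installed (`pip install manga-ocr`) and that its model can be downloaded on first use. The helpers `load_manga_ocr` and `unique_kanji` are names invented for this example, and the kanji filter only covers the main CJK Unified Ideographs block (U+4E00 to U+9FFF), which is an assumption that leaves out rarer characters.

```python
def load_manga_ocr():
    """Create a manga-ocr recogniser. The import is done lazily so the rest
    of this module still works when the package is not installed."""
    from manga_ocr import MangaOcr  # third-party: pip install manga-ocr
    return MangaOcr()


def unique_kanji(text):
    """Return the distinct kanji in text, in order of first appearance.

    Assumption: 'kanji' means the basic CJK Unified Ideographs block
    (U+4E00..U+9FFF); kana, punctuation, and Latin text are skipped.
    """
    seen = []
    for ch in text:
        if "\u4e00" <= ch <= "\u9fff" and ch not in seen:
            seen.append(ch)
    return seen


# Example usage ('page.png' is a placeholder for your own image):
#   mocr = load_manga_ocr()
#   text = mocr("page.png")        # recognised text as a string
#   candidates = unique_kanji(text)  # kanji to offer for lookup
```

Each kanji in the returned list could then be offered to the user as a candidate for lookup.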
