Tesseract with macOS

Hi all!

Tom from Sweden here (please excuse my grammar). This is my first post on this forum.
I am pretty new to automation but have grown fonder of it over the years.
This past year I have had my fair share of battles with OCR and automation.

My basic workflow is currently:

  1. Snap a photo of a bill with ScanBot for iOS.
  2. The photo is stored on iCloud.
  3. Hazel on macOS monitors the folder and picks the file up because it does not have an OCR tag.
  4. Hazel runs a workflow that opens ABBYY FineReader and OCRs it.
  5. It is tagged “OCR ready”, and Hazel can move it to the appropriate folder.

The flow itself works fine. My problem is the OCR output, which is, how do I put it,
not stellar at parsing Swedish. So I started searching for OCR software for macOS
that could be trained, and to my surprise I could not find any. The closest I came was using Homebrew to install Google's Tesseract, which gave me much better results.
However, training Tesseract seems very complicated.
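
For reference, this is roughly how Tesseract can be called with the stock Swedish model, without any training. It is only a minimal sketch: it assumes the `tesseract` binary is on the PATH and that the Swedish language data (`swe`) is installed (with Homebrew that usually means the separate language-data formula). The function names and the file path are made up for illustration.

```python
import shutil
import subprocess
from pathlib import Path


def tesseract_command(image, lang="swe"):
    """Build a Tesseract call that writes <image stem>.txt next to the image."""
    image = Path(image)
    # Tesseract takes an output *base* name and appends ".txt" itself.
    output_base = image.with_suffix("")
    return ["tesseract", str(image), str(output_base), "-l", lang]


def ocr_image(image, lang="swe"):
    """Run Tesseract on one image and return the path of the text file."""
    if shutil.which("tesseract") is None:
        raise RuntimeError("tesseract not found on PATH")
    subprocess.run(tesseract_command(image, lang), check=True)
    return Path(image).with_suffix(".txt")
```

Calling `ocr_image("bills/invoice.jpg")` (a hypothetical path) should leave the recognized text in `bills/invoice.txt`.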

Anyone have any ideas about alternate progs, gui´s etc for OCR training?



Tesseract is the core of OCRmyPDF which is free / open source. Installation instructions available here.

You can automate it using Hazel pretty easily. There is another discussion about this here.
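
As a rough idea of what Hazel could run, here is a small Python sketch that calls OCRmyPDF on the matched file. It rests on a few assumptions: the scan arrives as a PDF, `ocrmypdf` and the Swedish language data are installed, and Hazel hands the file path to the script as its first argument. The `_ocr` output suffix and the function name are just placeholders.

```python
import subprocess
import sys
from pathlib import Path


def ocrmypdf_command(pdf, lang="swe"):
    """Build an OCRmyPDF call that writes <name>_ocr.pdf next to the input."""
    pdf = Path(pdf)
    output = pdf.with_name(pdf.stem + "_ocr.pdf")
    # --skip-text leaves pages that already contain a text layer untouched.
    return ["ocrmypdf", "-l", lang, "--skip-text", str(pdf), str(output)]


if __name__ == "__main__" and len(sys.argv) > 1:
    # Assumption: the first argument is the file Hazel matched.
    subprocess.run(ocrmypdf_command(sys.argv[1]), check=True)
```

After the script succeeds, a Hazel rule can add the “OCR ready” tag and file the result. If ScanBot delivers an image rather than a PDF, OCRmyPDF may also need to be told the image resolution.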
