Is there an automation platform that can read scanned documents and pull out specific data?

I work in retail and spend a lot of my time scanning in vendor invoices and jotting down information from them (e.g. invoice numbers, dates, etc.). Is there an automation platform (I’m thinking maybe a combination of Hazel and DEVONthink) that can read the document and identify the date and invoice number, then rename the file with that data and put it in the appropriate folder? I’m not familiar with either of these platforms, but from what I’ve heard on the podcast this might be in the realm of possibility. Looking for some input before I spend the money on the software and dive in!

Edit: I can obviously train the automation for each specific vendor invoice. I’m not expecting it to just know where to look for the information from each specific vendor.


From the forums for those apps.


… use a match rule.


… use Hazel.

That said, DT (DEVONthink) does give you a filing solution you can train. For less standardised content, that would yield better results than filing with Hazel. Hazel excels at standardised content, which seems to be what you have.

Hope that helps.


+1 for Hazel. Depending on the quality of the OCR, Hazel can do exactly this.
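
If it helps to picture what a per-vendor rule boils down to, here is a rough Python sketch of the idea. To be clear, this is not how Hazel or DEVONthink work internally, and it is not their API. It assumes the OCR text has already been extracted from the scan, and the vendor name "acme", the patterns, and the function name `plan_filing` are all made up for illustration. Your real invoices would need their own patterns.

```python
import re
from datetime import datetime
from pathlib import PurePosixPath

# Hypothetical per-vendor rules: one set of patterns per invoice layout.
VENDOR_PATTERNS = {
    "acme": {
        "invoice": re.compile(
            r"Invoice\s*(?:Number|No\.?|#)\s*:?\s*([A-Z0-9-]+)", re.IGNORECASE
        ),
        "date": re.compile(r"Date\s*:?\s*(\d{1,2}/\d{1,2}/\d{4})", re.IGNORECASE),
        "date_format": "%m/%d/%Y",  # assumes US-style dates on this vendor's invoices
    },
}


def plan_filing(vendor, ocr_text, suffix=".pdf"):
    """Return the relative destination path for an invoice.

    Returns None when the invoice number or date cannot be found,
    for example because the OCR quality was too poor.
    """
    rules = VENDOR_PATTERNS[vendor]
    invoice = rules["invoice"].search(ocr_text)
    date = rules["date"].search(ocr_text)
    if not invoice or not date:
        return None
    parsed = datetime.strptime(date.group(1), rules["date_format"])
    # New file name: ISO date first so files sort chronologically.
    name = f"{parsed:%Y-%m-%d}_{vendor}_{invoice.group(1)}{suffix}"
    # Folder layout (vendor/year) is an arbitrary choice for this sketch.
    return PurePosixPath(vendor) / str(parsed.year) / name


sample = "ACME Supply\nInvoice No: INV-10423\nDate: 03/15/2024\nTotal: $120.00"
print(plan_filing("acme", sample))
```

The sketch only works out the new name and folder; it does not move anything. Returning None when a field is missing is the part worth copying into whatever tool you end up using, since a bad OCR pass should leave the file alone rather than misfile it.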

Looks like I’ll start with Hazel. Thanks!