I’m working on going paperless and have a few years of credit card statements and other documents containing magic numbers (account numbers and the like) that I would rather not have in the cloud. I’m scanning all of my documents onto a home server running Ubuntu Server and running OCR on everything. I want this whole process to be as automated as possible. Here is my ideal workflow.
I scan everything into a folder on the server and run OCR.
Each file gets processed and categorized (utility bill, credit card statement, etc.). Account numbers and other sensitive information are removed or redacted. The file is then renamed according to a naming convention, placed into a folder hierarchy, and automatically backed up.
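
To make the workflow concrete, here is a rough sketch in Python of the categorize, redact, and rename steps, working only on the text that OCR produces. Everything in it is an assumption for illustration: the keyword lists, the rule that a "sensitive number" is a run of 12 to 19 digits, the naming convention `category/YYYY/YYYY-MM-DD_issuer.pdf`, and the function names `categorize`, `redact`, and `target_path`. It does not touch the scanned image itself, so the numbers would still be visible in the PDF page.

```python
import re
from datetime import date
from pathlib import PurePosixPath

# Hypothetical keyword lists; tune them to the wording on your own documents.
CATEGORY_KEYWORDS = {
    "credit-card-statement": ("credit card", "minimum payment", "statement balance"),
    "utility-bill": ("kwh", "meter reading", "water usage", "gas usage"),
}

# Assumption: a sensitive number is a run of 12 to 19 digits, optionally
# grouped with single spaces or hyphens. This can over- or under-match.
ACCOUNT_RE = re.compile(r"\b\d(?:[ -]?\d){11,18}\b")


def categorize(text):
    """Pick the category whose keywords occur most often in the OCR text."""
    lowered = text.lower()
    scores = {
        category: sum(lowered.count(keyword) for keyword in keywords)
        for category, keywords in CATEGORY_KEYWORDS.items()
    }
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else "uncategorized"


def redact(text):
    """Mask long digit runs, keeping only the last four digits."""
    def mask(match):
        digits = re.sub(r"\D", "", match.group())
        return "X" * (len(digits) - 4) + digits[-4:]
    return ACCOUNT_RE.sub(mask, text)


def target_path(category, issuer, doc_date, ext=".pdf"):
    """Build a relative path of the form category/YYYY/YYYY-MM-DD_issuer.ext."""
    slug = re.sub(r"[^a-z0-9]+", "-", issuer.lower()).strip("-")
    filename = f"{doc_date:%Y-%m-%d}_{slug}{ext}"
    return PurePosixPath(category) / f"{doc_date:%Y}" / filename


if __name__ == "__main__":
    ocr_text = (
        "Credit card statement for account 4111 1111 1111 1111. "
        "Minimum payment due: $25."
    )
    category = categorize(ocr_text)
    print(category)
    print(redact(ocr_text))
    print(target_path(category, "Example Bank", date(2023, 1, 15)))
```

In this sketch the issuer and document date are passed in by hand; pulling them out of the OCR text reliably is a separate problem, as is the backup step.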