SOLVED! Using Hazel to OCR a password protected document


#1

I receive a weekly email that has a password-protected PDF attached. The same password is used every week. I would like Hazel to be able to rename and OCR it using Katie’s PDFPen OCR AppleScript, but it can’t OCR the file without it being unlocked. Is there an AppleScript (or something else that would work) that can unlock the PDF and save a copy that is not password protected? I could then have Hazel apply the OCR/rename rules to the unprotected copy and file it, since I don’t need it to be password protected in my files.


#2

One of the tools I use a lot for PDFs at work on my Windows PC is QPDF ("A Content-Preserving PDF Transformation System"). It’s a really handy command line utility when you are working with PDFs, and it’s available on lots of platforms including, of course, the Mac.

Take a look at the decryption option in the QPDF basic options.

You can use Hazel to trigger a command line script that invokes QPDF to decrypt the file using a command like this:

qpdf --decrypt --password=some_password InputName.pdf OutputName.pdf

Once decrypted, you just go about OCR-ing as before; exactly as you suggested. If timing becomes any sort of issue when you try it, consider calling the AppleScript in the same script as qpdf by invoking the AppleScript using the osascript command.
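
As a rough sketch of what the embedded script in a Hazel "Run shell script" action might look like: Hazel hands the matched file to the script as $1. The password, the "-unlocked" suffix, the function name and the AppleScript path are all placeholders I made up for illustration, and the extra PATH entries assume Hazel may not see the folders where Homebrew or MacPorts put things.

```shell
#!/bin/bash
# Sketch only: placeholder password, suffix and paths; adjust to your setup.

# Assumption: Hazel may run scripts with a minimal PATH, so add the usual
# Homebrew (Apple Silicon and Intel) and MacPorts bin folders.
export PATH="/opt/homebrew/bin:/usr/local/bin:/opt/local/bin:$PATH"

PDF_PASSWORD="some_password"

# Write a decrypted copy next to the original and print the new path.
unlock_pdf() {
    local input="$1"
    local output="${input%.pdf}-unlocked.pdf"
    # Any non-zero exit status from qpdf is treated as a failure here.
    qpdf --decrypt --password="$PDF_PASSWORD" "$input" "$output" || return 1
    printf '%s\n' "$output"
}

# Hazel passes the matched file as the first argument.
if [ -n "${1:-}" ]; then
    unlocked="$(unlock_pdf "$1")" || exit 1
    # Optional, hypothetical path: run the OCR AppleScript in the same script.
    # osascript "$HOME/Scripts/ocr.scpt" "$unlocked"
fi
```

The original file is left untouched, so a second Hazel rule can match on the "-unlocked" name for the OCR and rename steps. If you uncomment the osascript line, the AppleScript would need to accept the file path as an argument, which I have not checked for Katie's script.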

Hope that helps.


#3

This looks like it would do the trick, but I looked into setting up QPDF and it is way over my head unless I’m missing something very easy. I’m not at a level to know how to install and set things up via command line scripts. I can dabble with AppleScript but that’s about the most complex I’ve ever gotten.


#4

You can install QPDF via Homebrew (a package manager for Mac) or MacPorts (another ‘mostly’ package manager for Mac).

Homebrew:
brew install qpdf

MacPorts:
sudo port install qpdf

Package managers, for want of a better analogy, are like the Mac App Store: they are a means by which you can install and update software. These are, I believe, focussed on command line applications, in particular ones that are freely available (e.g. open source).

I would say that dabbling with the command line should be no more daunting than dabbling with AppleScript. It is just a different scripting language, or in fact a range of scripting languages, built right into the Mac. Installing a package manager is straightforward. Installing QPDF using a package manager is a single line of script.
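
If you want to confirm the install worked before wiring anything into Hazel, a small check like this in Terminal should do it. The function name check_tool is just something I made up for the example.

```shell
# Report whether a command-line tool can be found on the PATH.
check_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "$1 found at $(command -v "$1")"
    else
        echo "$1 not found; check that the install finished and that your PATH includes the package manager's bin folder"
        return 1
    fi
}
```

After pasting that in, running "check_tool qpdf" should print where qpdf lives, and "qpdf --version" should print its version number.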

Don’t write off being able to do something before you try. The fact that you even asked how to do it and had pieced everything else together already, including the idea, suggests to me that you are more capable than you might believe :wink:


#5

Awesome. That’s very helpful. I will give this a shot. Thanks!


#6

This is great. QPDF is awesome!

The question now is how to make Hazel test whether the PDF needs to be decrypted.


#7

What happens if you try to decrypt a file that has no encryption? Does the process fail, or could you just do it every time, just in case?


#8

Just tried it. Didn’t return any errors. Worked and generated the new “decrypted” file.


#9

Given the extra processing it would take to determine whether a file needs decrypting, the simpler approach is probably to run everything through a decryption step early in your Hazel process as a precaution.

Very occasionally, you find that you have an automation problem you don’t need to fix because it isn’t really a problem. I think that’s the case here :nerd_face:
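
That said, if you ever do want the explicit test, I believe recent QPDF releases have an --is-encrypted option that exits with status 0 when the file is encrypted and 2 when it is not. Older versions may not have it, so treat this as a sketch; the function name and the password are placeholders of mine.

```shell
# Placeholder password; can be overridden from the environment.
PDF_PASSWORD="${PDF_PASSWORD:-some_password}"

# Decrypt the input into the output path only when qpdf reports encryption;
# otherwise just copy the file so later steps always find the output.
unlock_if_needed() {
    local input="$1"
    local output="$2"
    # Assumption: --is-encrypted exits 0 for encrypted files, non-zero otherwise.
    if qpdf --is-encrypted "$input"; then
        qpdf --decrypt --password="$PDF_PASSWORD" "$input" "$output"
    else
        cp "$input" "$output"
    fi
}
```

Either way the later OCR and rename rules can work on the output file without caring whether the original was locked.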


#10

QPDF was definitely the trick. It works perfectly!