Must Have Hazel Rules

What tool are you using for the Word → PDF conversion?

I know you can do it with AppleScript and Pages, and I’m going to figure that out at some point. But for now, I do it with this little shell script, which I shamelessly stole from somewhere else, so I left the author info in there:

#!/bin/bash
# Jacob Salmela
# 2016-03-12
# Convert annoying DOCX into PDFs with a right-click
# Run this as an Automator Service

###### SCRIPT #######
for f in "$@"
do
  # Get the full file PATH without the extension
  filepathWithoutExtension="${f%.*}"

  # Convert the DOCX to HTML, which cupsfilter knows how to turn into a PDF
  textutil -convert html -output "$filepathWithoutExtension.html" "$f"

  retVal=$?
  if [ $retVal -ne 0 ]; then
      exit $retVal
  fi

  # Convert the file into a PDF
  cupsfilter "$filepathWithoutExtension.html" > "$filepathWithoutExtension.pdf"
  
  retVal=$?
  if [ $retVal -ne 0 ]; then
      exit $retVal
  fi

  # Remove the original file and the temporary HTML file, leaving only the PDF
  rm "$f" "$filepathWithoutExtension.html" >/dev/null
done
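
One thing to watch in the script above: the final `rm` deletes the original DOCX. A minimal sketch of a more cautious cleanup, assuming you only want the originals gone once a non-empty PDF actually exists (`remove_if_pdf_ok` is a hypothetical helper name, not part of the original script):

```shell
#!/bin/bash
# Sketch: delete the source files only after confirming the PDF was written.
# remove_if_pdf_ok is a hypothetical helper, not part of the original script.

# remove_if_pdf_ok PDF FILE...
# Removes FILE... only if PDF exists and is non-empty; otherwise keeps them.
remove_if_pdf_ok() {
  local pdf="$1"
  shift
  if [ -s "$pdf" ]; then
    rm -f -- "$@"
  else
    echo "Keeping originals: $pdf is missing or empty" >&2
    return 1
  fi
}
```

Inside the loop, the `rm` line could then become `remove_if_pdf_ok "$filepathWithoutExtension.pdf" "$f" "$filepathWithoutExtension.html"`.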

I use an AppleScript with Microsoft Word:

tell application "Microsoft Word" to set theOldDefaultPath to get default file path file path type documents path 

-- looks like we change the default path to where the document is and then set it back when we're done

try
tell application "Finder"
	set theFilePath to container of theFile as text
	
	set ext to name extension of theFile
	
	set theName to name of theFile
	copy length of theName to l
	copy length of ext to exl
	
	set n to l - exl - 1
	copy characters 1 through n of theName as string to theFilename
	
	set theFilename to theFilename & ".pdf"
	
	tell application "Microsoft Word"
		set default file path file path type documents path path theFilePath
		open theFile
		set theActiveDoc to the active document
		save as theActiveDoc file format format PDF file name theFilename
		close theActiveDoc
	end tell
	
end tell

end try

-- Restore Word's original default documents path
try
	tell application "Microsoft Word" to set default file path file path type documents path path theOldDefaultPath
end try

The only problem with this approach is that you have to pay for MS Word. I think you can get comparable results by writing an AppleScript that uses Pages.app, for those who don’t want to shell out the dough.

I actually gave your script a try in Hazel and it doesn’t quite work for me. Could you show me how you have your Hazel rule set up? This is what mine looks like:


Luckily I have a copy of Word from my university. I have pretty much the same setup as you do and the script works fine for me, strange… Have you given Hazel permission to control your Mac? Which versions of macOS and Word are you using? (Here: Mojave 10.14.2 and Word 16.20.)

Yesterday I tried to come up with a script for Pages and thought I had one, but it does not work properly. It worked once, seemingly at random, and then stopped working…
I used this hint: Pages to PDF via AppleScript, but could not get it to work with Hazel.

Do you know if I can pass page settings to your shell script, like page size, page numbers, etc.? I would love to do it with a shell script, because it would work in the background with no need for an additional program in the foreground, but the page size, page numbers and pictures do not get transferred properly…

EDIT: Of course there is another option if you use Calibre. I simply love this app! This is one script to convert epub, mobi, doc and docx to PDF. You can pass options to Calibre at the end of the command. Full documentation here: calibre-ebook-convert

FULL_PATH="$1"
ORIG_FILE=$(basename "$FULL_PATH") 
/Applications/Calibre.app/Contents/MacOS/ebook-convert "$ORIG_FILE" "converted/$(basename "$FULL_PATH" .epub).pdf" --custom-size=7x10,4 --base-font-size=11 --pdf-page-margin-right=50 --pdf-page-margin-left=50 --pdf-page-margin-bottom=70 --pdf-page-margin-top=50 --change-justification="justify" --pdf-footer-template="<p style=text-align:center;font-size:15>_PAGENUM_</p>"

Calibre is amazing. Keeps my digital books sorted!

What if I wanted to convert a bunch of .epub to .mobi or vice versa? I have several versions of one type that I want to convert to the other so I have both versions.

Author > Title > file_name.epub OR file_name.mobi

I guess I could do it via the Calibre converter, but I would need to do that manually. I would rather have Hazel find cases where only one OR the other is present and convert what is ‘missing’.

I think the problem with that is that you would have to import the newly converted books into Calibre again, because Calibre does not notice when a new file is created in its directory, even if it is in the right place with the right format.
If you then import the new (let’s say) epub into Calibre, you would have to enter the metadata for the book again, and it would be shown as a separate entry in Calibre and not as two versions of one book.
I would do a batch conversion within Calibre: open Calibre, sort books by format, select all books that have only one format, and then convert them all at once… I think using Hazel for that would cause more problems… (sadly, as I love both apps very much)

I would do this too, but skip sorting by format and convert all - because when you tell it to convert it asks you what to do with books that already have that format and then you can skip them all with just a click :wink:
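
For a plain folder of books outside the Calibre library, the "find what is missing" part of the question can be sketched in shell. This is a dry run under stated assumptions: it only prints which books lack a format and never converts or moves anything, and `missing_formats` is a hypothetical helper name. The caveat above still applies: files added behind Calibre's back are not picked up by its library.

```shell
#!/bin/bash
# Sketch: report books that exist as .epub or .mobi but not both.
# Dry run only: it prints what is missing and never converts or moves files.
# missing_formats is a hypothetical helper name.

missing_formats() {
  local root="$1" file base
  # -print0 with read -d '' keeps file names containing spaces intact
  find "$root" -type f \( -name '*.epub' -o -name '*.mobi' \) -print0 |
    while IFS= read -r -d '' file; do
      base="${file%.*}"
      case "$file" in
        *.epub) [ -e "$base.mobi" ] || printf 'missing mobi: %s\n' "$file" ;;
        *.mobi) [ -e "$base.epub" ] || printf 'missing epub: %s\n' "$file" ;;
      esac
    done
}

# Example call (the path is a placeholder): missing_formats "$HOME/Books"
```

The output lines could be reviewed by hand, or used as a checklist for a batch conversion inside Calibre.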


I use the following script to convert ebooks into other formats. I have 3 versions (1 for each format, mobi, epub and PDF) in a watched folder with the output saved to a final location.

FULL_PATH="$1"
ORIG_FILE=$(basename "$FULL_PATH")
/Applications/Calibre.app/Contents/MacOS/ebook-convert "$ORIG_FILE" "converted/$(basename "$FULL_PATH" .epub).mobi"
/Applications/Calibre.app/Contents/MacOS/ebook-convert "$ORIG_FILE" "converted/$(basename "$FULL_PATH" .epub).pdf"
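
The script above strips only a `.epub` suffix and passes just the file name, so it depends on the input being an epub and on the working directory. A minimal sketch of a variant that handles any input extension and uses the full path, assuming the same Calibre install location and the same `converted` subfolder (`converted_path` and `convert_book` are hypothetical helper names):

```shell
#!/bin/bash
# Sketch: build the output path from any input extension, not only .epub.
# CALIBRE_CONVERT and the "converted" subfolder mirror the script above;
# converted_path and convert_book are hypothetical helper names.

CALIBRE_CONVERT="/Applications/Calibre.app/Contents/MacOS/ebook-convert"

# Print "<folder of input>/converted/<name without extension>.<new ext>"
converted_path() {
  local full_path="$1" new_ext="$2"
  local dir name
  dir=$(dirname "$full_path")
  name=$(basename "$full_path")
  printf '%s/converted/%s.%s\n' "$dir" "${name%.*}" "$new_ext"
}

# Convert one book to one target format, creating the output folder first.
convert_book() {
  local full_path="$1" new_ext="$2" out
  out=$(converted_path "$full_path" "$new_ext")
  mkdir -p "$(dirname "$out")" || return 1
  "$CALIBRE_CONVERT" "$full_path" "$out"
}

# Hazel passes the matched file as $1; do nothing when run without arguments.
if [ -n "${1:-}" ]; then
  for ext in mobi pdf; do
    # Skip the format the input already has
    [ "${1##*.}" = "$ext" ] && continue
    convert_book "$1" "$ext" || break
  done
fi
```

Using the full path means the script should not care which folder Hazel runs it from.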

For mobi files I then run the following to email them to Amazon so that they are available in my Kindle.

tell application "Mail"
	set newMessage to make new outgoing message with properties {subject:"Hazel to Kindle", content:""}
	tell newMessage
		set sender to "my email address"
		make new to recipient at end of to recipients with properties {address:"?@kindle.com"}
		-- theFile is the matched file that Hazel passes to the script
		make new attachment with properties {file name:theFile} at after the last paragraph
	end tell
	send newMessage
end tell

Finally, I load the epub version into Books (my preferred reader app) by having Hazel open it with the application “Books”.


Same as all of the above: the usual stuff with bills. Hazel opens ABBYY FineReader and runs a workflow to OCR them, then Hazel sorts them.

  • It also unzips downloaded files and deletes the zip.
  • Removes empty folders from Downloads.
  • Removes old DMGs.
  • Zips and archives screenshots.
  • Sorts pictures, movies, music, etc.

Hi @E_Thelonius

…I also have ABBYY FineReader for OCR but can’t seem to have it OCR the PDFs without showing the ABBYY UI. Do you know if there is a way to let it OCR without showing the actual interface, so the process can run in the background?

Thanks in advance.

As far as I know that is not possible. For me the window stays in the background, though. It does not bother me while I am working because I run my apps in full screen… Sorry I cannot help more. I just know that DEVONthink can do background OCR (it uses the ABBYY FineReader engine, too). I am not sure about PDFpenPro, Adobe Acrobat and the like; maybe one of them can OCR in the background…

Edit: and welcome to the automators group :wink:


Thanks for your reply, as well as for the welcome… :wave: I’m glad to be here as there is so much to find. Looking forward to more automation :smile:

I just bought Hazel and I’m trying to get this to work. I set it up to run on a folder so that if an epub is found in there it runs your top script, but it doesn’t seem to be working. Any suggestions?

I also do a lot of automatic sorting and filing of PDFs. I used to process those from my download and scanning inbox folders. More recently, since I want to be able to perform those tasks from an iPad or iPhone, I have been using a dedicated “Hazel” inbox folder on iCloud.
I save PDFs there when using an iOS device (using a shortcut), and the processing is then done by Hazel in “server” mode.
For complex workflows I also use Hazel to send some files to Integromat (this allows me, for example, to build Google Sheets of invoices for tax declarations).
Finally, I am notified if necessary by either Hazel or Integromat using PushCut.

Hi,

Do you have Calibre installed?

Here is my code, check that your Shell path is correct.

Regards

Iain


One of my favorite time-saver rules is a filing, sorting and renaming of PDFs using info from the PDF itself. Info is pulled from either the download URL, or the contents of the PDF.

One example: I have to file sales taxes in multiple states. I download the filing and receipt PDFs to a standard downloads folder. Hazel watches my downloads folder, notices when a file matches the criteria, and renames it (and moves/sorts it by date) to something uniform for archiving:

/Sales Tax/2019/OH 2019 Q3 sales_tax filing.pdf

Every state uses a completely different naming convention for downloaded files, so this helps me keep my local records tidy with zero effort. I do similar things for CSV report downloads for my online stores.

And since download dates are offset from the actual sales tax filing period timeframe, I can use Hazel’s date offsets, or content matching tokens to grab the proper tax period for the filename.

Hazel has pretty amazing date-modification systems, offering less-common timeframes like quarters in addition to months/days/years/etc. So you can match the filing date in the PDF, have Hazel determine which quarter it was in, and use that for the “Q3” in the filename example.
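
Hazel handles this with its own date tokens, which are not shown here. For anyone doing the same thing in a script instead, the quarter arithmetic can be sketched in shell. This is only an illustration, assuming the filing date has already been extracted as `YYYY-MM-DD`; `quarter_label` and `archive_name` are hypothetical helper names.

```shell
#!/bin/bash
# Sketch: turn a filing date into the "2019 Q3" part of the file name.
# quarter_label and archive_name are hypothetical helpers; Hazel does this
# internally with its own date tokens, which are not shown here.

# quarter_label YYYY-MM-DD  prints "YYYY Qn"
quarter_label() {
  local year="${1%%-*}" rest="${1#*-}"
  local month="${rest%%-*}"
  # 10# forces base 10 so months like 08 and 09 are not read as octal
  printf '%s Q%d\n' "$year" $(( (10#$month - 1) / 3 + 1 ))
}

# archive_name STATE YYYY-MM-DD  prints the archive path for that filing
archive_name() {
  local state="$1" label
  label=$(quarter_label "$2")
  printf '/Sales Tax/%s/%s %s sales_tax filing.pdf\n' "${label%% *}" "$state" "$label"
}
```

For example, `archive_name OH 2019-08-15` prints `/Sales Tax/2019/OH 2019 Q3 sales_tax filing.pdf`, matching the naming pattern above.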

This could obviously be done for banking, credit card, and other downloaded documents, especially when the filename needs to include the time period but that period differs from the download/creation date of the downloaded file.

Most of my other Hazel rules are pretty simple, sorting and renaming older files into dated sub folders, things already mentioned or obvious. These complex rules described here are the most useful and also the most satisfying to get working.

This! This is what I’ve been wanting to achieve for ages. I’ve asked in forums here and there (probably not the right ones).

Could you please share how you do it? Thanks in advance.

PS: I wanted to login to comment, and shamefully noticed I didn’t have an account… I didn’t even realize! So here am I :sunglasses:


Hey Iain,

I have been looking for a solution to automatically converting epubs to pdf and came across your awesome post. I am a complete newbie and tried to use your script within Hazel but cannot get it to work.

I created a dedicated folder that Hazel is watching for “epub” files (using the “kind” rule) and then asked it to run the shell script you posted. I manually entered “/usr/bin/false” to make sure I have the right shell path…

Could you be so awesome (and patient) to provide some step-by-step guidance?

You would help me out a great deal!

Thanks so much in advance
Pete