Must Have Hazel Rules

Hi skylumer,
you are of course correct and I should have prefaced my post by saying: this only works for “expected” attachments, i.e. a monthly invoice. For non-recognized PDFs, it leaves them in my action folder for manual processing.

I do something very similar and this single thing makes Hazel invaluable for me. I treat my Desktop like a temporary folder. Like adabbagh, things that have been there over a week unopened get moved, in my case to a folder called OldDesktop, where they sit for about a month before ultimately being moved to the Trash. OldDesktop is like a safety net. Hazel tags the files that are on the Desktop Yellow -> Green when they are new, then no tag, then Orange -> Red as they are about to be moved off. If my Desktop has become cluttered, the Yellow or Green tags help me visually locate the more recent files, while the Orange or Red tags tell me that those files are about to be moved to the OldDesktop folder. I tag items that I want to keep on the Desktop (like the folder OldDesktop) in Blue, and Hazel is set up to ignore these.
Once I started considering my Desktop to be a temporary folder, life became a lot simpler for me and the Desktop metaphor suddenly became very useful. I “know” that if some version of something is on the Desktop it is not the “final” version. The Desktop is just a place where the bits and pieces of things that I am working on live and where I can send downloaded “stuff” that I need at the moment etc. The files there are easily seen, the icons provide visual cues and it is a very easy destination to access. Before Hazel, my Desktop contained a mish-mash of material. Now I “know” that if I left it there, I understood it had a limited lifespan. It is a very easy notion to assimilate.
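
For anyone who wants the aging logic spelled out, here is a minimal sketch in shell. It is only an illustration: the day boundaries for each colour are assumptions (the post only says files move after about a week), and `desktop_stage` is a hypothetical helper, not something Hazel provides.

```shell
#!/bin/bash
# Map a file's age in days to the tag/stage described in the post.
# ASSUMPTION: all day boundaries below are illustrative; only the
# "moved after more than a week" step comes from the post itself.
desktop_stage() {
  local age_days=$1
  if   (( age_days <= 1 )); then echo "Yellow"              # brand new
  elif (( age_days <= 2 )); then echo "Green"               # still recent
  elif (( age_days <= 4 )); then echo "none"                # no tag
  elif (( age_days <= 5 )); then echo "Orange"              # leaving soon
  elif (( age_days <= 7 )); then echo "Red"                 # about to be moved
  else                           echo "move-to-OldDesktop"  # older than a week
  fi
}
```

In Hazel itself each stage would be its own rule with a date condition and a colour action; the function only makes the ordering of the stages explicit.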

4 Likes

I have a bunch of rules (sorting, filing, checking PDFs).
I deal with a lot of PDFs (mostly scanned academic papers), so Hazel first checks whether a PDF needs OCR and whether it is in landscape. Most scanned landscape PDFs actually contain two pages per sheet, so Hazel makes two portrait pages out of each landscape page. If OCR is needed, it uses ABBYY FineReader to OCR the PDF. If the PDF is bigger than 30 MB, I also let ABBYY run over it, as that tends to make the file smaller. Then Hazel checks for certain keywords; if a keyword appears more than 10 times in the PDF, it tags the file accordingly. Once everything is finished, the PDF gets imported into Calibre. Word documents, ePubs and Mobi files that I put into the action folder are converted to PDF, because I have a dedicated PDF reader and need those files in a certain PDF format (a certain page size, and page numbers).
For newspaper articles I download and want to read later, it renames them with the date found in the article and moves them to an iCloud folder that is synced with my tablet and phone.
As for receipts and the like, I have two separate file names and rules according to those names: one for bills I still have to pay and one for bills I simply archive. When I still have to pay the bill, Hazel creates a reminder in Things and keeps the file in the folder until I tag it “payed”; then it renames it for the archive and puts it in DEVONthink by year and month (the ones I have already paid are filed right away).
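
The keyword check could be sketched in shell roughly like this. It is not the actual rule: it assumes the PDF text has already been extracted to a plain-text file (for example after OCR), and `count_keyword` and `keyword_qualifies` are hypothetical helper names.

```shell
#!/bin/bash
# Count case-insensitive occurrences of a keyword in a plain-text file.
# ASSUMPTION: the text was already extracted from the PDF into $2.
count_keyword() {
  local keyword=$1 text_file=$2
  # -o prints each match on its own line, -F treats the keyword literally
  grep -o -i -F -- "$keyword" "$text_file" | wc -l | tr -d ' '
}

# Succeed (exit status 0) when the keyword appears more than 10 times,
# which is the threshold mentioned in the post.
keyword_qualifies() {
  local count
  count=$(count_keyword "$1" "$2")
  (( count > 10 ))
}
```

A Hazel "Passes shell script" condition treats exit status 0 as a match, so something like `keyword_qualifies "Kant" extracted.txt` could gate a tagging action (the keyword and file name here are made up).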

4 Likes

What tool are you using for the Word → PDF conversion?

I know you can do it with AppleScript and Pages, and I’m going to figure out that approach at some point. But for now, I personally do it with this little shell script which I shamelessly stole from somewhere else, so I left the author info in there:

#!/bin/bash
# Jacob Salmela
# 2016-03-12
# Convert annoying DOCX into PDFs with a right-click
# Run this as an Automator Service

###### SCRIPT #######
for f in "$@"
do
  # Get the full file PATH without the extension
  filepathWithoutExtension="${f%.*}"

  # Convert the DOCX to HTML, which cupsfilter knows how to turn into a PDF
  textutil -convert html -output "$filepathWithoutExtension.html" "$f"

  retVal=$?
  if [ $retVal -ne 0 ]; then
      exit $retVal
  fi

  # Convert the file into a PDF
  cupsfilter "$filepathWithoutExtension.html" > "$filepathWithoutExtension.pdf"
  
  retVal=$?
  if [ $retVal -ne 0 ]; then
      exit $retVal
  fi

  # Remove the original file and the temporary HTML file, leaving only the PDF
  rm "$f" "$filepathWithoutExtension.html" >/dev/null
done
4 Likes

I use an AppleScript with Microsoft Word:

tell application "Microsoft Word" to set theOldDefaultPath to get default file path file path type documents path 

-- looks like we change the default path to where the document is and then set it back when we're done

try
	tell application "Finder"
		set theFilePath to container of theFile as text
		set ext to name extension of theFile
		set theName to name of theFile
		copy length of theName to l
		copy length of ext to exl

		set n to l - exl - 1
		copy characters 1 through n of theName as string to theFilename
		set theFilename to theFilename & ".pdf"

		tell application "Microsoft Word"
			set default file path file path type documents path path theFilePath
			open theFile
			set theActiveDoc to the active document
			save as theActiveDoc file format format PDF file name theFilename
			close theActiveDoc
		end tell
	end tell
end try

try
	tell application "Microsoft Word" to set default file path file path type documents path path theOldDefaultPath
end try
3 Likes

The only problem with this approach is that you have to pay for MS Word. I think you can get comparable results by writing an AppleScript using Pages.app for those who don’t want to shell out the dough.

I actually gave your script a try in Hazel and it doesn’t quite work for me. Could you show me how you have your Hazel rule set up? This is what mine looks like:

1 Like

Luckily I have a copy of Word from my university. I have pretty much the same setup as you do and the script works fine for me, strange… Have you given Hazel permission to control your Mac? Which versions of macOS and Word are you using? (Here: Mojave 10.14.2 and Word 16.20.)

Yesterday I tried to come up with a script for Pages. I thought I had one, but it does not work properly: it worked once, seemingly at random, and then stopped working…
I used this hint: Pages to PDF via AppleScript, but could not get it to work with Hazel.

Do you know if I can pass page settings to your shell script, like page size, page numbers etc.? I would love to do it with shell, because it would run in the background with no need for an additional program in the foreground, but the page size, page numbers and pictures do not get transferred properly…

EDIT: Of course there is another option if you use Calibre – I simply love this app! This is one script to convert epub, mobi, doc and docx to PDF. You can pass certain options to Calibre at the end. Full documentation here: calibre-ebook-convert

FULL_PATH="$1"
ORIG_FILE=$(basename "$FULL_PATH") 
/Applications/Calibre.app/Contents/MacOS/ebook-convert "$ORIG_FILE" "converted/$(basename "$FULL_PATH" .epub).pdf" --custom-size=7x10,4 --base-font-size=11 --pdf-page-margin-right=50 --pdf-page-margin-left=50 --pdf-page-margin-bottom=70 --pdf-page-margin-top=50 --change-justification="justify" --pdf-footer-template="<p style=text-align:center;font-size:15>_PAGENUM_</p>"
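
One thing to watch if you reuse this for the other formats: `basename "$FULL_PATH" .epub` only strips a `.epub` suffix, so a `.docx` input would end up as `name.docx.pdf`. Here is a minimal sketch of a helper that strips whatever the last extension is. `pdf_output_path` is a made-up name for illustration, and the `converted/` folder is taken from the script above.

```shell
#!/bin/bash
# Build the output path "converted/<name>.pdf" for any input extension.
# pdf_output_path is a hypothetical helper, not part of Calibre or Hazel.
pdf_output_path() {
  local full_path=$1
  local file_name=${full_path##*/}   # drop the directory part
  local stem=${file_name%.*}         # drop the last extension, whatever it is
  printf 'converted/%s.pdf\n' "$stem"
}
```

The result can then be used in place of the hard-coded name, e.g. `ebook-convert "$ORIG_FILE" "$(pdf_output_path "$FULL_PATH")"` followed by the same options as before.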
2 Likes

Calibre is amazing. Keeps my digital books sorted!

What if I wanted to convert a bunch of .epub to .mobi or vice versa? I have several versions of one type that I want to convert to the other so I have both versions.

Author > Title > file_name.epub OR file_name.mobi

I guess I could do it via the Calibre converter, but I would need to do that manually. I would rather have Hazel search for cases where only one OR the other is present and convert what is ‘missing’.
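
As a rough idea of the "find what is missing" part, here is a shell sketch that lists every book that exists in one format but has no sibling in the other. It assumes both formats would share the same file name apart from the extension (as in the Author > Title layout above); `list_missing_format` is a hypothetical helper, and it only lists candidates without converting anything.

```shell
#!/bin/bash
# Print every file with extension $2 under folder $1 that has no sibling
# file with extension $3 next to it.
# ASSUMPTION: both formats share the same base name in the same folder.
# File names containing newlines are not handled.
list_missing_format() {
  local library=$1 have=$2 want=$3 book
  find "$library" -type f -name "*.$have" | sort | while IFS= read -r book; do
    # ${book%.*} is the path without its extension
    [ -e "${book%.*}.$want" ] || printf '%s\n' "$book"
  done
}
```

For example, `list_missing_format ~/Books epub mobi` would print the epubs that still lack a mobi (the `~/Books` path is just a placeholder).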

I think the problem with that is that you would have to import the newly converted books into Calibre again, because Calibre does not notice when a new file is created in its directory, even if it is in the right place with the right format.
If you then import the new (let’s say) epub into Calibre, you would have to enter the metadata for the book again, and it would be shown as a separate entry in Calibre and not as two versions of one book.
I would do a batch conversion within Calibre: open Calibre, sort books by format, select all books that have only one format, and then convert them all at once… I think using Hazel for that would cause more problems… (sadly, as I love both apps very much)

I would do this too, but skip sorting by format and convert all - because when you tell it to convert it asks you what to do with books that already have that format and then you can skip them all with just a click :wink:

2 Likes

I use the following script to convert ebooks into other formats. I have 3 versions (1 for each format, mobi, epub and PDF) in a watched folder with the output saved to a final location.

FULL_PATH="$1"
ORIG_FILE=$(basename "$FULL_PATH")
/Applications/Calibre.app/Contents/MacOS/ebook-convert "$ORIG_FILE" "converted/$(basename "$FULL_PATH" .epub).mobi"
/Applications/Calibre.app/Contents/MacOS/ebook-convert "$ORIG_FILE" "converted/$(basename "$FULL_PATH" .epub).pdf"
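
If keeping three near-identical copies of the script gets tedious, one possible variation is to work out the target formats from the input file's extension. This is only a sketch under that assumption: `target_formats` is a hypothetical helper, and the format list (mobi, epub, pdf) is taken from the description above.

```shell
#!/bin/bash
# Print the formats to produce for an input file: every format in the
# set except the one the file already has.
# target_formats is a hypothetical helper name.
target_formats() {
  local file_name=${1##*/}   # drop the directory part
  local ext=${file_name##*.} # keep only the last extension
  local fmt
  # Compare case-insensitively so "Book.EPUB" is treated like "book.epub"
  ext=$(printf '%s' "$ext" | tr '[:upper:]' '[:lower:]')
  for fmt in mobi epub pdf; do
    [ "$fmt" = "$ext" ] || printf '%s\n' "$fmt"
  done
}
```

Each printed format could then be used as the extension of the output file in an `ebook-convert` call like the ones above, since `ebook-convert` picks the output format from that extension.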

For mobi files I then run the following to email them to Amazon so that they are available on my Kindle.

tell application "Mail"
	-- Create the outgoing message; Hazel passes the matched file in as theFile
	set newMessage to make new outgoing message with properties {subject:"Hazel to Kindle", content:""}
	tell newMessage
		set sender to "my email address"
		make new to recipient at end of to recipients with properties {address:"?@kindle.com"}
		make new attachment with properties {file name:theFile} at after the last paragraph
	end tell
	-- Send the message once the attachment has been added
	send newMessage
end tell

Finally I load the epub version into Books (my preferred reader app) by having Hazel open it with the application “Books”.

3 Likes

Like everyone above, I do the usual stuff with bills: Hazel opens ABBYY FineReader and runs a workflow to OCR them, then Hazel sorts them.

  • It also unzips downloaded files and deletes the zip.
  • Removes empty folders from downloaded.
  • Removes old DMGs.
  • Zips and archives screenshots.
  • Sorts pictures movies music etc.
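
For comparison, two of these cleanup steps can be sketched in plain shell with `find`. This is not how Hazel does it internally; the function names are made up, and the age threshold is whatever you pass in (the list above does not say what counts as "old").

```shell
#!/bin/bash
# Delete empty subfolders of a folder; the folder itself is kept.
# -delete implies depth-first traversal, so a folder that only contained
# empty folders is removed as well.
remove_empty_folders() {
  find "$1" -mindepth 1 -type d -empty -delete
}

# List .dmg files that were last modified more than N days ago.
# ASSUMPTION: "old" is judged by modification time; N is up to the caller.
list_old_dmgs() {
  local folder=$1 max_age_days=$2
  find "$folder" -type f -name '*.dmg' -mtime +"$max_age_days"
}
```

Note that on macOS a folder holding only a hidden `.DS_Store` file does not count as empty for `find`.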

Hi @E_Thelonius

…I also have ABBYY FineReader for OCR but can’t seem to have it OCR the PDFs without showing the ABBYY UI. Do you know if there is a way to let it OCR without seeing the actual interface, so the process can be done in the background?

Thanks in advance.

As far as I know that is not possible. For me the window stays in the background though – when I am working it does not bother me because I run my apps in full screen… Sorry I cannot help you. I just know that DEVONthink can do background OCR (it uses the ABBYY FineReader engine, too). I am not sure about PDFpenPro, Adobe Acrobat and the like – maybe one of them can OCR in the background…

Edit: and welcome to the automators group :wink:

1 Like

Thanks for your reply, as well as for the welcome… :wave: I’m glad to be here as there is so much to find. Looking forward to more automation :smile:

I just bought Hazel and I’m trying to get this to work. I set it up to run on a folder so that if an epub is found in there it runs your top script, but it doesn’t seem to be working. Any suggestions?

I also do a lot of automatic sorting/filing of PDFs. I used to process those from the download and scanning inbox folders. More recently, since I want to be able to perform those tasks from iPad/iPhone, I have been using a dedicated “Hazel” inbox folder on iCloud.
I save PDFs there when using an iOS device (using a shortcut); the processing is then done by Hazel in “server” mode.
For complex workflows I also use Hazel to send some files to Integromat (this allows me, for example, to build Google Sheets of invoices for my tax declaration).
Finally, I am notified if necessary by either Hazel or Integromat using PushCut.

Hi,

Do you have Calibre installed?

Here is my code, check that your Shell path is correct.

Regards

Iain

1 Like