Hazel reading Excel Files

Hello Automators,

I want to see if someone can help me with Hazel and Excel files.

I get regular reports in Excel that have random names but the content is consistent and I would like to rename and organize them based on that. I tested the Contents > Contain Match > Custom Table tool in Hazel 5 and was able to accomplish this with a PDF or CSV export but it is unable to read the .xls or .xlsx file.

Does anyone have tips on how to read or search an Excel file so Hazel can rename it based on contents? Or if there is another application that can do this?

I know that Hazel cannot read Excel files but I have seen other people commenting about Openpyxl, Python3, and possibly Keyboard Maestro to read the file. I also saw this post referencing AppleScript, Pandas and I Sheet You Not but I don’t know how to make this work inside of Hazel instead of KM or Alfred.

Thanks in advance for any help!

I don’t know Hazel but from what you write it doesn’t sound like your need is Hazel-specific. Your requirements are really vague, other than reading and .xls or .xlsx file. What data are you looking to get from that file? Names of worksheets? Contents of cells? Metadata?

I can tell you for free: Both Excel file formats are hideously complex and unfriendly. .xls (legacy format) is binary data. .xlsx is a zipfile containing several XML documents, which may sound like an improvement [2]; alas, the XML data structure is just as baroque and brittle as the binary format, just with lots of tags in it.

Naturally, Microsoft Excel reads and writes both file formats like a pro. Third-party apps and libraries will vary in their level of file compatibility, feature support, but I don’t imagine you’re looking to do anything terribly complex so can likely find something “good enough” if you don’t want to involve Excel.app.

Using AppleScript means opening the file in Excel.app, pulling out whatever data you want via application commands, then closing the file without saving. It can be a bit annoying to have AS opening and closing files in Excel while you’re trying to work, but unless AppleScript’s “haunted house” effect is a deal-breaker for you I’d advise going with it as the simplest solution to implement. If Excel isn’t available, you could script Apple’s Numbers.app instead as it can also read Excel files (it won’t support all file features, and its AS support isn’t great, but you should be able to get sheet and cell data okay).

If you’d prefer to read Excel files directly, a third-party library such as openpyxl (which reads and writes .xlsx format only; you’d need to rummage on PyPI for another that reads .xls) is probably adequate for your needs, although you’ll probably have to install Python 3 from python.org first as Apple no longer includes in in macOS by default. (Or, if you’ve already installed Xcode or its command-line tools, you’ll have it with that.) Bit of a pain, Python and Ruby being removed from macOS, although I can understand that decision Apple’s POV.

Or, instead of Python, you could install Node.js [2] and use Excel-reading libraries from npm. Same basic deal: you’re just writing a JS script instead of a Python script. Either way, the script is saved a plain text file with something like #!/usr/bin/env python3 or #!/usr/bin/env node as its first line, and run as a shell script in Hazel. Obviously check whichever libraries you use come with decent user documentation (I know pyopenxl has) with plenty of examples to crib from.

One more thought: macOS’s Quick Look appears able too peer directly into Excel files sufficiently well to display a basic preview of their contents. Alas, I’m not aware of a way to hook into the QL machinery programmatically to extract that file data for other uses. (Typical obtuse Apple.) Nor is there anything in the Cocoa APIs and other public macOS frameworks AFAIK. (There are public APIs for file formats like RTF and PDF. .doc and .docx are somewhat readable too, albeit buggy.) You could try rummaging the web for any sneaky tricks, but it’s probably a waste of time: if there was an [un]official way to do it, I’d expect it to be well known already—you’re far from first to want to read non-Apple files on macOS.

TL;DR: Use AppleScript to open Excel files in Excel.app and fetch their contents from there, unless that approach is impractical for some reason.

[1] I have written basic .xlsx parsers before. My recommendation: just don’t. It’s a lot like staring into a basilisk, only far less pleasant.

[2] The Hazel docs say Hazel supports “JavaScript” as an alternative to AppleScript. But I’m assuming that means macOS’s JavaScript for Automation, which is crap, dead, and doesn’t support npm, so is functionally useless. Outside of iOS (where choice is restricted), Node.js is the only standalone JS that matters.

As the questioner behind that post, I can vouch for the AppleScript method. You should be able to hack it together with a little hunting on the forum, but if you’re having trouble I can dig up the shortcut I built after posting that thread. It reads the contents of a specific cell from a given sheet, so it probably does what you’re looking for.

As for renaming the file, my approach is usually to handle everything I can in Shortcuts and farm tasks out to AppleScript only when I need to. I would use the “return <variable>” at the end of an AppleScript action to feed the data back into a shortcut, and then use Shortcuts to set the name of the file.