I don’t know Hazel but from what you write it doesn’t sound like your need is Hazel-specific. Your requirements are really vague, other than reading and .xls
or .xlsx
file. What data are you looking to get from that file? Names of worksheets? Contents of cells? Metadata?
I can tell you for free: Both Excel file formats are hideously complex and unfriendly. .xls
(legacy format) is binary data. .xlsx
is a zipfile containing several XML documents, which may sound like an improvement [2]; alas, the XML data structure is just as baroque and brittle as the binary format, just with lots of tags in it.
Naturally, Microsoft Excel reads and writes both file formats like a pro. Third-party apps and libraries will vary in their level of file compatibility, feature support, but I don’t imagine you’re looking to do anything terribly complex so can likely find something “good enough” if you don’t want to involve Excel.app.
…
Using AppleScript means opening the file in Excel.app, pulling out whatever data you want via application commands, then closing the file without saving. It can be a bit annoying to have AS opening and closing files in Excel while you’re trying to work, but unless AppleScript’s “haunted house” effect is a deal-breaker for you I’d advise going with it as the simplest solution to implement. If Excel isn’t available, you could script Apple’s Numbers.app instead as it can also read Excel files (it won’t support all file features, and its AS support isn’t great, but you should be able to get sheet and cell data okay).
…
If you’d prefer to read Excel files directly, a third-party library such as openpyxl
(which reads and writes .xlsx
format only; you’d need to rummage on PyPI for another that reads .xls
) is probably adequate for your needs, although you’ll probably have to install Python 3 from python.org first as Apple no longer includes in in macOS by default. (Or, if you’ve already installed Xcode or its command-line tools, you’ll have it with that.) Bit of a pain, Python and Ruby being removed from macOS, although I can understand that decision Apple’s POV.
Or, instead of Python, you could install Node.js [2] and use Excel-reading libraries from npm. Same basic deal: you’re just writing a JS script instead of a Python script. Either way, the script is saved a plain text file with something like #!/usr/bin/env python3
or #!/usr/bin/env node
as its first line, and run as a shell script in Hazel. Obviously check whichever libraries you use come with decent user documentation (I know pyopenxl
has) with plenty of examples to crib from.
…
One more thought: macOS’s Quick Look appears able too peer directly into Excel files sufficiently well to display a basic preview of their contents. Alas, I’m not aware of a way to hook into the QL machinery programmatically to extract that file data for other uses. (Typical obtuse Apple.) Nor is there anything in the Cocoa APIs and other public macOS frameworks AFAIK. (There are public APIs for file formats like RTF and PDF. .doc
and .docx
are somewhat readable too, albeit buggy.) You could try rummaging the web for any sneaky tricks, but it’s probably a waste of time: if there was an [un]official way to do it, I’d expect it to be well known already—you’re far from first to want to read non-Apple files on macOS.
…
TL;DR: Use AppleScript to open Excel files in Excel.app and fetch their contents from there, unless that approach is impractical for some reason.
–
[1] I have written basic .xlsx
parsers before. My recommendation: just don’t. It’s a lot like staring into a basilisk, only far less pleasant.
[2] The Hazel docs say Hazel supports “JavaScript” as an alternative to AppleScript. But I’m assuming that means macOS’s JavaScript for Automation, which is crap, dead, and doesn’t support npm, so is functionally useless. Outside of iOS (where choice is restricted), Node.js is the only standalone JS that matters.