Get text from webpage

Is there a way to get just the text from a web page? I can get the HTML from a web page, but I need just the text.

Some possible solutions I can think of:

  • Load jQuery into the website (preferably with a data URL, so it isn’t dependent on your internet speed), and when loaded, run $("body").text(); in the website. This returns the whole text.
  • Bundle cheerio with browserify and import it into your Scriptable script:
    1. Install Node.js on your computer
    2. Install browserify and cheerio via npm
    3. Run browserify -r cheerio and save the output to a file (for example, browserify -r cheerio > cheerio.js)
    4. Copy the output file to your iCloud Drive into the Scriptable folder (or any subfolder)
    5. In your Scriptable script, import the file from above like const cheerio = importModule("lib/cheerio")("cheerio"); (assuming you’ve placed the file at lib/cheerio.js). Why the second call? Because the file from browserify returns a function to load different modules, and we want to load cheerio
    6. Load the HTML from your page in question
    7. Load the HTML into cheerio with let $ = cheerio.load(html), where html contains the HTML
    8. Get the text with $("body").text()
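
For the first suggestion (running a script inside the page), here is a minimal sketch. It assumes Scriptable's WebView with its loadURL and evaluateJavaScript methods, and it swaps jQuery's $("body").text() for the built-in document.body.innerText so nothing has to be injected. The helper name getPageText is made up for illustration, and like the steps above this has not been tested inside Scriptable.

```javascript
// Hypothetical helper (not part of Scriptable): loads a page in a web view
// and returns the visible text of its <body>.
// `webView` is assumed to behave like Scriptable's WebView, i.e. it offers
// loadURL(url) and evaluateJavaScript(script), both returning promises.
async function getPageText(webView, url) {
  // Wait until the page has finished loading.
  await webView.loadURL(url);

  // The value of the last expression in the script is handed back.
  // innerText is used here instead of jQuery's $("body").text().
  const text = await webView.evaluateJavaScript("document.body.innerText");

  // Fall back to an empty string if the page returned nothing usable.
  return typeof text === "string" ? text.trim() : "";
}
```

Inside a Scriptable script this would presumably be called as const text = await getPageText(new WebView(), url); where url is the page in question. Passing the web view in as a parameter is only a design choice to keep the helper easy to try out with a stand-in object.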

There might be some errors; I haven’t tested any of this.


This could be improved, but if you want to use Shortcuts, this might get you started.

From Safari, share the webpage to the shortcut.

Preview Text of Webpage

Yes, that works… sometimes. It’s a website that has frames, so sometimes I get the body and sometimes I don’t. I just put in an if statement that says if the body has no value, run the shortcut again. Not pretty, but hey, it works. Thanks!
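
If the same "run it again when the body is empty" workaround is needed in a script rather than a shortcut, a capped retry loop is one way to do it. This is only a sketch: retryUntilText and the default of three attempts are assumptions for illustration, and fetchText stands for whatever function actually gets the page text.

```javascript
// Hypothetical helper: calls fetchText() until it yields non-empty text,
// giving up after maxAttempts tries so an always-empty page cannot loop forever.
async function retryUntilText(fetchText, maxAttempts = 3) {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const text = await fetchText();

    // Treat missing values and whitespace-only strings as "no body".
    if (typeof text === "string" && text.trim() !== "") {
      return text;
    }
  }
  // Nothing usable came back within the allowed attempts.
  return null;
}
```

Unlike re-running the shortcut unconditionally, the cap means a page that never returns a body ends with null instead of retrying indefinitely.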