For practical reasons I usually don’t follow web pages that lack an RSS feed, but there are a few exceptions. To compensate for the missing feed, I’ve made a Shortcut that regularly checks those pages for new content. It uses the actions “Get contents of URL” followed by “Get URLs from input”, and then a bit of RegEx to extract the latest articles from the resulting list of URLs. Because these are native Shortcuts actions, they frequently ask for permission. That means the automation can stall at any moment and stay stalled until I physically get to the device to approve the prompt.
My hope is to be able to use Scriptable to replace the native actions, possibly even improving on the original Shortcut by extracting metadata (author, publication date) in addition to the URL.
I’ve been able to get the raw HTML from a URL in Scriptable, but I’m stuck on how to extract anything useful from it.
// Load the page in a WebView and read back its HTML as a string.
const url = 'https://www.webpage.com/';
const webview = new WebView();
await webview.loadURL(url);
const html = await webview.getHTML();
console.log(html);
As you can probably tell from this code, my knowledge of JavaScript is pretty basic. I suspect I’ll have to use some functions for getting HTML elements, but I can’t get it to work no matter what I do. Any tips on how to proceed from here would be appreciated.
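For reference, here is a minimal sketch of what I'm trying to reproduce from the Shortcut: pull the link targets out of the HTML string and keep only the ones that look like articles. The helper name `extractLinks`, the sample HTML and the `/articles/` path pattern are all made up for illustration; the real page will need its own pattern. It is plain regex on a string rather than real HTML parsing, so I'd only expect it to cope with simple, quoted `href` attributes.

```javascript
// Hypothetical helper: collect href values from <a> tags in a raw HTML string.
// Regex-based, so it only handles simple single- or double-quoted hrefs.
// `filter` should be a RegExp WITHOUT the g flag (g makes .test() stateful).
function extractLinks(html, filter) {
  const anchor = /<a\b[^>]*?\bhref\s*=\s*(["'])(.*?)\1/gi;
  const seen = new Set();
  const links = [];
  let match;
  while ((match = anchor.exec(html)) !== null) {
    const href = match[2];
    // Keep the first occurrence of each matching link, in page order.
    if (filter.test(href) && !seen.has(href)) {
      seen.add(href);
      links.push(href);
    }
  }
  return links;
}

// Made-up sample standing in for the HTML a real page would return.
const sampleHtml = `
  <a href="/articles/2024/first-post">First</a>
  <a class="nav" href='/about'>About</a>
  <a href="/articles/2024/second-post">Second</a>
  <a href="/articles/2024/first-post">First again</a>`;

const articleLinks = extractLinks(sampleHtml, /^\/articles\//);
console.log(articleLinks);
```

In Scriptable, the string returned by `webview.getHTML()` would take the place of `sampleHtml`. For metadata such as author or date, regex will probably get fragile quickly; `WebView` also has an `evaluateJavaScript` method, which might allow running `document.querySelectorAll` inside the loaded page instead, but I haven't worked out how to use it yet.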