Transcription + Audio > to Bear

I am looking for a way to quickly record a voice memo and save both the audio file and the transcribed text into Bear.

Currently I use a simple shortcut that just saves the audio - works fine to capture but I waste way too much time having to open and listen to each recording again afterwards.

Previously I used a shortcut that would directly transcribe and save the text in Bear, but too often that resulted in total gibberish because my quick thoughts are a mixture of English and my native language within single sentences.

So ideally I’d like the transcription attempt to be saved in Bear, which in many cases would let me quickly identify what the note/thought was about, and in those cases where the transcription is gibberish I could still open the audio file as well.

Could someone please confirm whether this would be possible with Scriptable, or would it run into the same limitation as the Shortcuts app, where you can only handle either text or audio but not both at the same time?


I think the issue you have is that the native dictation feature (which I assume is what you have been using for transcription) captures the audio itself, live, for transcription. The Record Audio action likewise captures audio but produces a media object/file. What Shortcuts doesn’t have is a bridging action to pass a media object into dictation. I haven’t seen any other apps that can do that with the native dictation either, but that’s not to say someone hasn’t figured out a way to do it.

Assuming there isn’t a way, that suggests that you would need to use an alternative service. I’m pretty sure Nuance’s Dragon dictation might be an option, but it is likely to be an expensive one.

If your dictation is not sensitive in nature, you may wish to look at web-based transcription services that offer APIs. You may then be able to upload a recording (audio file) from Shortcuts and receive back a transcription which you can then push to Bear.

Another option might be if you have an always on computer (e.g. a Mac running Hazel), then you may be able to use Shortcuts to send a recording to the computer (e.g. via iCloud, Dropbox) and have it run some software locally to do the transcription and then put it into Bear on behalf of your iOS device.
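To illustrate the last step of that idea, here is a rough Python sketch of how a script on the computer might hand both the transcript and the recording to Bear in a single note. It assumes Bear's x-callback-url `create` action accepts `title`, `text`, `tags`, `file` (base64) and `filename` parameters, which is worth checking against Bear's current documentation. The function name and the `voice-memo` tag are made up for illustration, and the transcription itself is assumed to have been produced elsewhere.

```python
import base64
from urllib.parse import quote, urlencode


def bear_create_url(title, transcript, audio_bytes, audio_filename,
                    tags=("voice-memo",)):
    """Build a bear:// URL that creates one note with text plus the audio file.

    Hypothetical helper: the parameter names are assumed from Bear's
    x-callback-url scheme and should be verified before relying on them.
    """
    params = {
        "title": title,
        "text": transcript,
        "tags": ",".join(tags),
        # The attachment is passed as base64 together with a file name.
        "file": base64.b64encode(audio_bytes).decode("ascii"),
        "filename": audio_filename,
    }
    # quote_via=quote encodes spaces as %20 instead of "+".
    return "bear://x-callback-url/create?" + urlencode(params, quote_via=quote)
```

On a Mac the resulting URL could be passed to the `open` command, e.g. `subprocess.run(["open", url])`. Long recordings produce very long URLs, so this is probably only practical for short memos.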

Hope that gives you some ideas.

Thanks for your reply.
Yes that’s indeed the issue.
I don’t know if Apple only allows native apps to access Apple’s speech-to-text while still being able to store/handle the original audio file as well.

e.g. prior to Shortcuts I used the app Just Press Record which kinda does 50% of what I need except it froze way too often and of course does not have the quick one-swipe organization Bear has, resulting in a big pile of memos too slow to organize through.

So it’s definitely possible through Apple’s dictation framework but I suppose Apple will only allow access to native apps - was kinda hoping that Scriptable could perhaps be a gateway/solution for that.

Came across Google’s API as a possible workaround so will probably end up hiring a developer to link into that.
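For anyone going the same route, a minimal sketch of what the request body for Google's Speech-to-Text REST endpoint might look like, in Python. It only builds the JSON and does not send anything. Assumptions: the recording has already been converted to FLAC (iOS voice memos are normally m4a), the `alternativeLanguageCodes` field is available in the API version you use (please verify, it may be limited to certain versions), and the language codes in the usage note are placeholders since the native language isn't stated here. The function name is hypothetical.

```python
import base64
import json

# Endpoint the body would be POSTed to (with an API key or OAuth token).
RECOGNIZE_URL = "https://speech.googleapis.com/v1/speech:recognize"


def build_recognize_request(audio_bytes, primary_language, other_languages=()):
    """Return the JSON request body for a short, synchronous recognition call."""
    config = {
        "encoding": "FLAC",  # assumes the audio was converted beforehand
        "languageCode": primary_language,
    }
    if other_languages:
        # Assumed field for mixed-language speech; check the API docs.
        config["alternativeLanguageCodes"] = list(other_languages)
    audio = {"content": base64.b64encode(audio_bytes).decode("ascii")}
    return json.dumps({"config": config, "audio": audio})
```

Usage would be something like `build_recognize_request(flac_bytes, "en-US", ["de-DE"])`, with the returned transcript then saved to Bear alongside the original audio file.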

edit: noticed that Scriptable has dictation functionality as well but based on the documentation it does the same as what dictation in a shortcut does i.e. you get the text in exchange for your audio
