Whisper AI to Day One?

Chrisp · September 13, 2023, 8:17am

I’ve been dissatisfied with dictation when creating Day One entries. I was planning on creating some kind of automation using Whisper AI (which I use a lot for transcribing audio and find very good) to add a Day One entry with that transcription (perhaps through the Day One CLI?).

Before I take the time, I thought I’d ask if this is a thing someone else has done?

If not, I’ll put it together. If I do that, is anyone also interested in this workflow?

Chrisp · September 13, 2023, 5:09pm

Had a few moments and threw together a VERY hacked-together solution. Ha!

In short:

1. I invoke a Keyboard Maestro (KM) script that opens my terminal app of choice and pastes the following:

ffmpeg -f avfoundation -i ":0" ~/automations/day_one_transcriptions/audiocapture.mp3

2. That script starts recording using ffmpeg and my default microphone, placing the capture in a folder (the day_one_transcriptions folder above).
3. When finished recording, I press ctrl + c to stop the recording.
4. Hazel watches that folder and runs the following script:

whisper "$1" --language English --model large-v2 --fp16 False --output_dir . --output_format txt
TRANSCRIPTION=$(cat "./audiocapture.txt")
dayone2 --journal Journal new "$TRANSCRIPTION" --tags transcribed
rm "./audiocapture.mp3" "./audiocapture.txt"
curl "https://trigger.keyboardmaestro.com/t/2SAMPLE-SAMPLE?TriggerValue"

- That script:
  - transcribes the audio file using Whisper AI
  - saves the output as a TXT file
  - reads that file and passes its contents to the Day One CLI
  - deletes both the audio and TXT files
  - hits a Keyboard Maestro trigger that notifies me the process is complete

This is definitely hacky, but works for now. Not sure I like all the movement between tools, so I may just make this all into a single terminal script, create a raycast extension, or something else. But figured I’d pass on what I did so far.
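For anyone who wants the "single terminal script" version, here is a rough sketch of how the same steps might fit into one file. The commands and flags are the ones above; CAPTURE_DIR and the function names are made up for illustration, and this is untested against a real Day One install, so treat it as a starting point only.

```shell
#!/bin/bash
# Sketch only: record -> transcribe -> journal in one script.
# CAPTURE_DIR and the function names are hypothetical.

CAPTURE_DIR="${CAPTURE_DIR:-$HOME/automations/day_one_transcriptions}"

record_audio() {
  # A no-op INT handler (rather than ignoring INT) lets ctrl + c stop
  # ffmpeg while this script carries on to the next step.
  trap ':' INT
  ffmpeg -f avfoundation -i ":0" "$CAPTURE_DIR/audiocapture.mp3"
  trap - INT
}

transcribe_to_day_one() {
  local audio="$CAPTURE_DIR/audiocapture.mp3"
  local text_file="$CAPTURE_DIR/audiocapture.txt"

  # Whisper names the transcript after the audio file: audiocapture.txt
  whisper "$audio" --language English --model large-v2 --fp16 False \
    --output_dir "$CAPTURE_DIR" --output_format txt || return 1

  dayone2 --journal Journal new "$(cat "$text_file")" --tags transcribed || return 1

  # Only clean up once the entry has been created.
  rm -f "$audio" "$text_file"
  osascript -e 'display notification "Added to Day One!" with title "Transcription Complete"'
}

# To run the whole flow, add this line at the bottom:
#   record_audio; transcribe_to_day_one
```

The main difference from the Hazel version is that the audio and transcript are only deleted if the Day One step succeeds, so a failed run leaves the recording in place.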

Chrisp · September 13, 2023, 9:33pm

Allow me to continue this conversation with myself…

During lunch, I realized that it would be easier to run all this from Raycast. So now…

1. I run a Raycast Script

This starts a process recording my microphone.

Here’s the Script:

#!/bin/zsh

# @raycast.title Add Entry
# @raycast.author Chris Pennington
# @raycast.description Add a new entry to Day One.
# @raycast.mode fullOutput
#
# @raycast.icon dayone.png
#
# @raycast.packageName DayOne
# @raycast.schemaVersion 1

ffmpeg -f avfoundation -i ":0" ~/automations/day_one_transcriptions/audiocapture.mp3

Here’s the resulting view:

2. Stop process

I press ctrl + c to stop the process in Raycast.

3. Hazel reacts.

Hazel runs the script and, instead of the KM hack, uses a native macOS notification to tell me when it’s done.

Here’s the script:

# Transcribe the file Hazel passes in as $1; Whisper writes audiocapture.txt to the current folder.
whisper "$1" --language English --model large-v2 --fp16 False --output_dir . --output_format txt
TRANSCRIPTION=$(cat "./audiocapture.txt")
# Create the Day One entry from the transcript, then clean up.
dayone2 --journal Journal new "$TRANSCRIPTION" --tags transcribed
rm "./audiocapture.mp3" "./audiocapture.txt"
osascript -e 'display notification "Added to Day One!" with title "Transcription Complete"'

4. Notification displays

The native macOS notification appears to tell me everything is done.
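One thing to watch in the Hazel script from step 3: it assumes the capture is always called audiocapture.mp3 and that the script runs from the watched folder. A small sketch, assuming Whisper keeps naming the transcript after the audio file's base name (as it does with audiocapture.txt above), derives the transcript path from whatever file Hazel passes in. The helper name transcript_path_for is made up for illustration.

```shell
# Hypothetical helper: print where Whisper's .txt output should land
# when --output_dir is the folder that contains the audio file.
transcript_path_for() {
  local audio="$1"
  local dir base
  dir="$(dirname "$audio")"
  base="$(basename "$audio")"
  # Drop only the last extension, e.g. audiocapture.mp3 -> audiocapture.txt
  printf '%s/%s.txt\n' "$dir" "${base%.*}"
}

# Possible use inside the Hazel script:
#   whisper "$1" --language English --model large-v2 --fp16 False \
#     --output_dir "$(dirname "$1")" --output_format txt
#   TRANSCRIPTION=$(cat "$(transcript_path_for "$1")")
```

With that in place the script no longer depends on the working directory or on one fixed file name.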

I wouldn’t mind figuring out how to get ffmpeg to stop recording after a break of 5 seconds or something, but the hour or so I’ve spent is probably sufficient. I’m happy enough with the result. Leaving this for anyone else who wants to pick it up and run.
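On the stop-after-silence idea, one possible approach (a sketch, not something from the setup above) is ffmpeg's silencedetect audio filter, which logs a silence_start line once the input has been quiet for the configured duration. A wrapper can watch the log and send ffmpeg the same interrupt that ctrl + c would. The function names are hypothetical, the -35dB threshold is a guess that would need tuning per microphone, and note that 5 seconds of silence at the very start would also end the recording.

```shell
# Reads ffmpeg log lines on stdin and runs the given command the first
# time a silencedetect "silence_start" line shows up.
on_first_silence() {
  local line
  while IFS= read -r line; do
    case "$line" in
      *silence_start*) "$@"; return 0 ;;
    esac
  done
  return 1
}

# Hypothetical wrapper: record to $1 until roughly 5 seconds of silence.
record_until_silence() {
  local out="$1"
  local fifo_dir fifo pid
  fifo_dir="$(mktemp -d)" || return 1
  fifo="$fifo_dir/ffmpeg.log"
  mkfifo "$fifo" || return 1

  # ffmpeg logs to stderr; send that log through the named pipe.
  ffmpeg -nostdin -nostats -f avfoundation -i ":0" \
    -af "silencedetect=noise=-35dB:d=5" "$out" 2> "$fifo" &
  pid=$!

  # Interrupt ffmpeg at the first reported silence, then keep draining
  # the log so ffmpeg can finish writing the file.
  { on_first_silence kill -INT "$pid"; cat > /dev/null; } < "$fifo"

  wait "$pid"
  rm -rf "$fifo_dir"
}

# Possible use:
#   record_until_silence ~/automations/day_one_transcriptions/audiocapture.mp3
```

Since the file still lands in the watched folder, the Hazel rule should pick it up exactly as before.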

Chrisp · October 12, 2023, 12:06pm

One more addendum: this became both a video and a post here: OpenAI Whisper: Transcribe in the Terminal for free - DEV Community