I first stumbled across VoiceBase back in 2015 when I was looking for a cheap way to produce a transcript of the PEI Home and School Federation Leaders Debate on Education. While I didn’t end up using their automated transcription system (I opted for human transcription from Rev.com), returning to look into where VoiceBase excels has been in the back of my mind since then.
It was jogged loose by a recommendation from Mark Frauenfelder in last week’s Recomendo newsletter:
Dirt Cheap Transcription
VoiceBase takes audio recordings and turns them into text. It also analyzes the text to identify subjects and keywords, and can play back the audio as it highlights the text. It’s not as good as a human transcriber, but it does a decent job and is much cheaper (2 cents a minute compared to $1 a minute for a human). You get $60 in free credit to try it out, too. — MF
While Mark is right that VoiceBase isn’t as good as a human transcriber, the interface to its machine transcription is very interesting in that it presents a novel way to experience audio.
Witness this view into Live From the Formosa Tea House, Episode Five.
VoiceBase not only transcribes the audio but also analyzes the text and tags it by subject.
In the screenshot below, I’ve navigated to the keyword category “Popular culture,” and then clicked on “Cool (aesthetic).” The timeline of the audio highlights the places identified with that tag, and I can click on any of them to jump to that place in both the audio and the text:
The first snippet tagged with “Cool (aesthetic)” came at 00:52:06, when Dan James described something as “a pretty cool thing” when talking about the Queen Street Commons:
And your team comes here and start to create relationships between us so it’s been a pretty cool thing to watch we had a first members meeting last night with as many members as can make it and it was really interesting to hear them talk about the ownership they feel now building.
The tagging is extremely detailed, including categories like “Populated coastal places in Canada” (Charlottetown and Vancouver in our episode), “Standards,” “Companies listed on the NASDAQ,” “RCA Record Albums,” and “Street Furniture.”
For long audio recordings from long ago, where the details have long been forgotten, diving in via VoiceBase provides a quick way to find the sections I’m looking for, and to easily copy and paste text for other purposes.
Of course all of this has a more Orwellian aspect to it: VoiceBase’s marketing isn’t focused on the obscure-podcast transcription market, as witnessed by the use of copy like “quickly identify ‘Hot Leads’ vs. ‘Non Prospects’ and determine the quality of your leads” and “phrase spotting allows a business to create custom groups of specific terms that will be automatically tagged when present. Simple logic can be applied to trigger an automated business process.”
It’s not difficult to imagine that this is exactly the sort of technology that intelligence agencies are using right now to process the collected telephone conversations of the world; if nothing else, VoiceBase provides a cheap way to get a visceral sense of how easy and effective it is to vacuum up and analyze audio.