As recently as a few years ago, speech-to-text was nowhere near as capable as it is now. It was highly unreliable and barely considered for voicemail transcription. Now, as the technology advances and speech recognition becomes more accurate, speech-to-text is used for a variety of applications, including the voicemail transcription offered by some telecom providers.
You see it in action all the time: when you use your voice to type on your computer, when you dictate your text messages so that you can keep your eyes on the road (which we all do, of course), and even when you call a business and tell an automated voice which menu option you want, rather than pressing any numbers.
We recently posted an article that goes into more depth about how Automatic Speech Recognition works with the IVR on a phone system.
Speech-to-text accuracy isn’t up to spec.
The thing is, speech-to-text programs are still not all that accurate. Wired magazine quotes a senior scientist at Microsoft as saying that “commercially available systems” have about a 12 percent error rate, which is apparently about twice the error rate of humans trying to understand the same speech.
As far as a voicemail transcription feature is concerned, there are providers who offer it, but many do not, due to this significant error rate.
A small error makes a big difference.
A 12 percent error rate doesn’t seem too bad, at a glance, but when you think about it, this can be a considerable problem for voicemail transcription. A few words either missing or mistakenly transcribed by the software could make a lot of difference.
If the program transcribes the wrong account number, for example, that might result in lost or misplaced funds, or changes made to the wrong account. The program might even transcribe something that makes you decline to call someone back because of what you think they said, when the caller never actually said it.
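To put rough numbers on that, here is a quick back-of-the-envelope sketch in Python. It rests on a few assumptions that are ours, not Wired's: that the 12 percent figure applies to each word independently, that every digit of an account number is spoken as its own word, and that the 50-word voicemail and 10-digit account number are made-up examples. Real transcription errors tend to cluster, so treat the results as a ballpark rather than a measurement.

```python
def expected_errors(word_count: int, error_rate: float) -> float:
    """Average number of words we would expect to be transcribed wrongly."""
    return word_count * error_rate


def chance_all_correct(word_count: int, error_rate: float) -> float:
    """Probability that every word comes through correctly.

    Assumes each word is right or wrong independently of the others,
    which is a simplification.
    """
    return (1.0 - error_rate) ** word_count


ERROR_RATE = 0.12  # the error rate quoted in the Wired article

# A hypothetical 50-word voicemail.
print(round(expected_errors(50, ERROR_RATE), 1))      # 6.0

# A hypothetical 10-digit account number, one spoken word per digit.
print(round(chance_all_correct(10, ERROR_RATE), 2))   # 0.28
```

Under those assumptions, a short voicemail would carry about six wrong words, and a 10-digit account number would come through fully intact only a little more than a quarter of the time.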
The phone network connection doesn’t help, either.
Another problem for voicemail transcription is the very thing that inspired the game of “Telephone.” The telephone medium does not provide very good sound quality, and there isn’t really any getting around that, as much of the network infrastructure is still hopelessly antiquated.
It’s hard enough for us to understand each other over the phone, and computers don’t fare much better, which is why voicemail transcription, at least for now, is not as reliable as it needs to be.
Of course, the technology will improve, and in fact the error rate for some speech-to-text programs is starting to get closer to 8 or 9 percent. Hopefully one day soon, we will be able to completely trust a program to transcribe our voicemail messages and make life a little easier.
1. Jesse Jarnow, “Why Our Crazy-Smart AI Still Sucks at Transcribing Speech,” Wired, April 8, 2016, https://www.wired.com/2016/04/long-form-voice-transcription/