It may sound like something out of sci-fi, but artificial intelligence is all around us. It's a fact. Every time you open a social media app, it's artificial intelligence that personalizes what you see on your feeds. Every time you say "Hey Siri", it's artificial intelligence that enables your phone to understand what you're asking.
AI transcription is another example of artificial intelligence being used in everyday life. But what exactly is AI transcription? Where did it come from, how is it used today, and how is it likely to be used in the future? Let's answer some of your burning questions...
AI transcription is the use of artificial intelligence to convert speech into text. Instead of a human having to manually take notes or transcribe an audio recording, AI transcription does the work for you, listening to your audio and converting it into text.
And the benefits of AI transcription (also referred to as speech recognition, computer speech recognition, or automatic speech recognition) are clear and tangible. It's fast -- the power of AI means you can get a transcript within minutes, if not seconds. Think how that compares to writing out a recording by hand...
AI transcription is also typically much cheaper than using a human transcription service. That's because an hour of audio takes approximately four hours for a professional to transcribe, and the average price they charge is 75 cents to $1.50 per minute. That works out to $45-$90 per hour of audio transcription. By comparison, an hour of transcription time costs as little as $2 with Transcribe.
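The cost comparison above is simple per-minute arithmetic; here it is as a few lines of Python (the rates are the figures quoted above, and the helper name is ours):

```python
def transcription_cost(audio_minutes, rate_per_minute):
    """Cost of transcription at a flat per-minute rate, in dollars."""
    return audio_minutes * rate_per_minute

# One hour of audio at the quoted professional range of $0.75-$1.50/minute:
low = transcription_cost(60, 0.75)   # 45.0
high = transcription_cost(60, 1.50)  # 90.0
print(f"Human: ${low:.0f}-${high:.0f} per audio hour")
```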
Things could get really technical here, so we'll keep it as straightforward as possible. Think about how a child learns a language. They hear speech around them on a daily basis, which trains their brain to build connections between sounds, words and their meaning.
Speech recognition technology works in a very similar way. Advanced machine learning and natural language processing techniques train computers to recognize sounds and build connections between those very same sounds, words and their meaning.
Speech recognition software listens to speech and compares what it hears to what's stored in its extensive library of words, expressions, and sentences, so that it can convert what it hears into text. And there you have it -- an AI transcription!
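The "compare what it hears against a stored library" idea above can be sketched as a toy matcher. This is purely illustrative -- the words, feature vectors, and function names are all invented here, and real systems use deep neural networks over spectral features rather than a lookup table -- but it shows the basic shape of matching sounds to stored patterns:

```python
import math

# Toy "library" of stored word patterns: each word maps to an invented
# feature vector (real systems extract spectral features from audio).
TEMPLATES = {
    "hello": (0.9, 0.1, 0.3),
    "world": (0.2, 0.8, 0.5),
    "siri":  (0.4, 0.4, 0.9),
}

def distance(a, b):
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def recognize(heard):
    """Match one heard feature vector to the closest stored word."""
    return min(TEMPLATES, key=lambda word: distance(TEMPLATES[word], heard))

def transcribe(frames):
    """Convert a sequence of heard vectors into text, word by word."""
    return " ".join(recognize(f) for f in frames)

# A noisy "recording" -- vectors close to, but not exactly, the templates:
print(transcribe([(0.85, 0.15, 0.25), (0.25, 0.75, 0.55)]))  # hello world
```

Early word-matching systems like those in the 1950s and 60s worked on roughly this principle; modern AI transcription replaces the fixed template library with statistical models trained on vast amounts of speech.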
AI transcription isn't something that was born overnight -- it's something that scientists have been working on for decades. Let's take a look at the brief history of speech recognition.
1952 - The first ever speech recognition system -- named Audrey -- was built by Bell Laboratories. It could recognize the sound of a spoken digit (zero to nine) with more than 90% accuracy when spoken by its developer, but it was far less accurate with voices it wasn't familiar with.
1960s - At the 1962 World's Fair, IBM showcased the Shoebox, which could understand 16 spoken English words. In the same decade, the Soviets created an algorithm capable of recognizing 200 words. All of these systems worked by matching individual words against stored voice patterns.
1970s - A program at Carnegie Mellon University, funded by the US Department of Defense, developed the Harpy, which had a vocabulary of over 1,000 words. The biggest breakthrough was that it could recognize entire sentences.
1980s - IBM created a voice activated typewriter called Tangora, which had a 20,000-word vocabulary and used statistics to predict and identify words.
1990s - At the very start of the decade, Dragon Systems released the first consumer speech recognition product -- Dragon Dictate. In 1997, they released an upgrade called Dragon NaturallySpeaking. This was the first continuous speech recognition product, and it could recognize speech at 100 words per minute. Fun fact: it's still used today!
2000s onwards - AI speech-to-text technology has come a long way in the past couple of decades, with Google leading the way with its voice search product, and the likes of Apple, Amazon and Microsoft not far behind.
AI transcription is used in a whole host of ways today. From dictating messages to your friends and family to asking Siri to perform a Google search for you, chances are you're already benefiting from AI transcription in one way or another.
AI transcription is also popular with a wide audience when it comes to getting written transcriptions of meetings, lectures, interviews and podcasts:
Businesses use it to get written notes from meetings, conferences, and Zoom calls.
Academics use it to generate lecture notes that they can share with their students, and to get transcripts of interviews they've conducted as part of their academic research.
Students use it to save themselves the trouble of note taking during lectures and seminars, receiving written transcripts within minutes of class ending that they can use for revision purposes.
Podcasters use it to get transcriptions to publish alongside their podcasts.
Journalists use it to get notes from interviews and press conferences, and to add captions to video interviews.
Let's dive into some data.
According to Statista, e-Learning and market research are the two main industries using AI transcription, with a 64% usage rate. This is closely followed by the software & internet industry, and the advertising & marketing industry.
The global voice recognition market size is forecast to grow from $10.7 billion in 2020 to $27.16 billion by 2026, and AI transcription will inevitably benefit from this growth. As investment increases, AI and machine learning capabilities will keep improving. AI transcription will continue to become faster, more accurate and more accessible, making it more and more popular with those who currently use human transcription services or DIY transcription methods.
The more developed AI software becomes, the better it will get at understanding different accents and differentiating between different speakers. It may even be able to perform topic analysis and create summaries.
Ultimately, AI transcription will continue to make meetings more productive, increase workplace efficiency, and enable businesses and individuals to convert speech to text quickly, cheaply, and accurately.
Written by Katie Garrett