There is the image of a typist in a courtroom transcribing every word that’s said, a stenographer typing the dialogue of a sitcom into a computer for closed captions on TV, and a secretary tucked away in the corner of a conference room taking minutes for an important business meeting. Humans transcribing speech has been a cornerstone of society from the time of the original orators to now. The demand for it has not disappeared and will not disappear. Technology is being developed for this exact reason. However, the perception that it will be a one-for-one exchange of technology for human, with the technology simply improving the practice, is too simple. It will also open new avenues and enable other benefits.
Cost and Ubiquity
Yes, speech-to-text transcription will largely improve on what humans can do. While some businesses’ software still has growing pains, general accuracy is improving, and some businesses already boast accuracy of 99%. The key difference, though, is accessibility.
The difference in cost between human and AI-based transcription is stark. Professional human transcription services are expensive; the AI-based alternatives are often far cheaper. Depending on the needs of the individual or business, the AI-based alternative is also often more widely available. Native smartphone apps can be downloaded for free and used to transcribe thoughts, for instance. Anyone from legal teams to fiction writers can use them, because thoughts move faster than the hand can write: while they are jotting a thought down, the word they are writing displaces the next one in the sentence forming in their head, and the thread is disrupted or lost. Though, of course, a writer’s recorded note might not fetch as much at auction as their notebooks.
Additionally, this ubiquity helps differently abled people. Hearing-impaired people, for instance, will have greater access to a variety of communication tools for holding simple conversations or engaging with media.
Live Captions for Events
Live events are becoming increasingly internet-based and spontaneous, which asks different questions of transcription services. Real-time closed captions have been available for live TV in some form for a while, but their transition to a more accessible form for internet-based media opens that content up to more people. The next generation of voice transcription services, built by businesses like Verbit, will be used by streamers playing video games to tens of thousands of viewers, or by influencers who have just received new products from a sponsor and want to go live spontaneously and unbox them for their followers. The affordable cost and quality of the product allow smaller creators to cater to their whole audience; without it, hearing-impaired viewers or non-native speakers could be excluded from their content. This matters especially at a time when gaming companies are making consoles more accessible, developers are doing the same with their games, and platforms like Twitch are highlighting differently abled streamers.
All these benefits apply to business too, for online events and note taking. Being embraced by the wider B2B and enterprise market is usually an indicator that a technology has reached a strong level, and that further adoption will only improve it.