OpenAI Launches Whisper API for Third-Party Developers to Integrate Speech-to-Text at Low Cost

29 Apr 2023

OpenAI has launched a Whisper API that lets third-party developers integrate speech-to-text transcription into their apps and services at low cost. The Whisper API is a hosted version of the open-source Whisper speech-to-text model, which the company released in September 2022. Whisper is an automatic speech recognition system; the API costs $0.006 per minute, supports transcription in multiple languages, and accepts files in M4A, MP3, MP4, MPEG, MPGA, WAV, and WEBM formats.
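As a rough illustration of the pricing and format details above, the following Python sketch checks a file extension against the listed formats and estimates a transcription cost. The function names are hypothetical, and per-second proration is an assumption rather than OpenAI's documented billing rule; uploading audio to the API is not shown. At the quoted rate, one hour of audio works out to about $0.36.

```python
from pathlib import Path

# Price and formats as quoted in the article; both may change over time.
PRICE_PER_MINUTE_USD = 0.006
SUPPORTED_EXTENSIONS = {"m4a", "mp3", "mp4", "mpeg", "mpga", "wav", "webm"}


def is_supported(filename: str) -> bool:
    """Return True if the file extension is one of the formats listed above."""
    return Path(filename).suffix.lower().lstrip(".") in SUPPORTED_EXTENSIONS


def estimate_cost_usd(duration_seconds: float) -> float:
    """Estimate the cost of transcribing audio of the given length.

    Assumption: cost scales linearly with duration (prorated per second).
    """
    if duration_seconds < 0:
        raise ValueError("duration must be non-negative")
    return duration_seconds / 60 * PRICE_PER_MINUTE_USD


if __name__ == "__main__":
    print(is_supported("interview.MP3"))
    print(f"${estimate_cost_usd(3600):.2f}")
```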
Google, Amazon, and Meta have built capable speech recognition systems of their own, but OpenAI's Whisper stands out because it was trained on 680,000 hours of multilingual and "multitask" data collected from the web. This gives it improved recognition of unique accents, background noise, and technical jargon.

OpenAI's president and chairman, Greg Brockman, explained that the Whisper API is an optimized version of the same large model that is available as open source, and that it is much faster and more convenient to use. According to a 2020 Statista survey, the main barriers to enterprises adopting voice transcription technology are accuracy, accent- or dialect-related recognition issues, and cost.
"Our picture is that we really want to be this universal intelligence," Brockman said. "We really want to, very flexibly, be able to take in whatever kind of data you have and whatever kind of task you want to accomplish and be a force multiplier on that attention."


Whisper has limitations, particularly in "next-word" prediction, which stem from the enormous amount of data the system was trained on. OpenAI cautions that Whisper might include words in its transcriptions that weren't spoken, possibly because it is trying both to predict the next word in the audio and to transcribe the recording itself. Whisper's performance also varies by language, with a higher error rate for speakers of languages that are less well represented in the training set.
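The article does not say which metric lies behind the "error rate" figure. A common measure for speech recognition is word error rate (WER): the number of word substitutions, deletions, and insertions needed to turn the transcript into a reference text, divided by the number of words in the reference. The sketch below assumes that metric; the function name and the lowercase, whitespace-based tokenization are illustrative choices, not OpenAI's evaluation method.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (ref_word != hyp_word)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)
```

Under this measure, a word that was never spoken but appears in the transcript counts as an insertion, so the extra words OpenAI warns about raise the score directly.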
OpenAI expects Whisper's transcription capabilities to be used to improve existing software, services, and tools. The AI-powered language learning app Speak already uses the Whisper API to power a new in-app virtual speaking companion. Speech-to-text could also prove profitable for OpenAI: one estimate places the market's value at $5.4 billion by 2026, up from $2.2 billion in 2021.
