Whisper ASR by OpenAI - AI Transcription Like a Pro (2023)


What is OpenAI Whisper ASR?

OpenAI’s new Whisper AI is an Automatic Speech Recognition (ASR) system that listens to what we say and transcribes it with impressive accuracy. Your voice goes into the AI and the text comes out, and the results pass with flying colors, often coming close to what a human transcriber would produce.

Related: What is ChatGPT By OpenAI?

Whisper is a neural net that approaches human-level robustness and accuracy on English speech recognition. Check out the Whisper Model Card here.

It is trained on 680,000 hours of multilingual, multitask supervised data collected from the web. This large and diverse dataset improves robustness to accents, background noise, and technical language. The best part about Whisper AI is that it can transcribe speech in multiple languages and also translate from those languages into English. To make it more useful, OpenAI has open-sourced the models and the inference code.

What is Transcribing?

Transcribing is the process of converting audio recordings or speech into written text. This can be done by a real human transcriber who listens to the audio and types out what is being said. This can also be done by using speech recognition technology, such as Automatic Speech Recognition (ASR) to automatically transcribe the audio.

Disclaimer: This article on OpenAI Whisper contains affiliate links, which means we get a small commission if any purchase is made. This will not add any extra cost to you, but it will help us produce more valuable content in the future.

Transcription is used in a variety of contexts: for subtitling and captioning videos, in legal and medical settings to document spoken statements, in business and education to transcribe meetings and lectures, and in research and journalism to transcribe interviews. The output of transcribing is often a written document in plain text format, but it can also be in other formats such as .vtt for captioning, .srt for subtitles, etc.
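
To make the subtitle formats mentioned above more concrete, here is a minimal sketch of how timed transcript segments could be written out as .srt text. The function names and the sample segments are made up for illustration and are not part of Whisper or any other tool.

```python
def srt_timestamp(seconds: float) -> str:
    """Format a time in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def to_srt(segments) -> str:
    """Build SRT text from (start_seconds, end_seconds, text) tuples."""
    blocks = []
    for index, (start, end, text) in enumerate(segments, start=1):
        blocks.append(
            f"{index}\n{srt_timestamp(start)} --> {srt_timestamp(end)}\n{text.strip()}\n"
        )
    # A blank line separates one subtitle block from the next.
    return "\n".join(blocks)


# Hypothetical transcript segments, made up for illustration.
example_segments = [
    (0.0, 2.5, "Hello and welcome."),
    (2.5, 6.04, "Today we look at speech recognition."),
]
print(to_srt(example_segments))
```

A .vtt file is laid out in a similar way, but it starts with a WEBVTT header line and uses a dot instead of a comma before the milliseconds.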

What is Automatic Speech Recognition (ASR)?

Automatic Speech Recognition, also known as ASR, is the technology that allows a computer or other device to convert spoken words into written text. It's a form of artificial intelligence (AI) that can recognize spoken language and transcribe it into a written format. ASR is used in a variety of applications, such as voice-controlled devices, voice assistants, and speech-to-text dictation software. ASR technology is based on machine learning algorithms that are trained on large datasets of speech samples in order to learn to recognize and transcribe speech with high accuracy.

Related: What is DALL-E 2?

How To Install OpenAI Whisper & Deploy It Using Python?

Installing Whisper AI is quite simple, and it doesn't require you to pay anything to use it. You just need a few prerequisites, starting with a reasonably fast system with Python installed. An NVIDIA graphics card that supports CUDA is recommended for speed; without one, Whisper can still run on the CPU, only more slowly.

Python Installation:

In the Command Prompt window, type python --version and press Enter to check whether Python is already installed.
The Whisper GitHub page mentions Python 3.9.9, so we prefer that version.
If you have a 64-bit CPU, download the 64-bit Windows installer.

First Time Installing Python:

Remember to Check "Add to PATH" which appears on the first page.

Multiple Python in the System:

If you already have Python installed, make sure not to tick that option.

If you rename the new executable to tell the versions apart, remember to rename it from python.exe to python39.exe, etc. After this, hit Start and search for PATH or Environment.
Under “User Environment Variables”, select the entry starting with “Path” and then click “Edit”. Click “New” and enter the path where Python 3.9.9 is installed.
Then click “New” once more and enter the path C:\python399\Scripts.
This will allow whisper to run from the command line.
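
To confirm that the PATH changes above took effect, a short check like the following can help. It is only a sketch: the helper names are made up for illustration, and the whisper command will only be found once Whisper itself has been installed.

```python
import shutil
import sys


def python_version_string() -> str:
    """Return the running interpreter's version, e.g. '3.9.9'."""
    return ".".join(str(part) for part in sys.version_info[:3])


def find_on_path(command: str):
    """Return the full path of a command found on PATH, or None."""
    return shutil.which(command)


print("Python version:", python_version_string())
print("Interpreter:", sys.executable)
# 'whisper' only shows up here after Whisper is installed
# and its Scripts folder is on PATH.
print("whisper command:", find_on_path("whisper") or "not found on PATH")
```

If the version printed is not the one you expected, the wrong Python is probably first in your PATH.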

CUDA:

If you already have NVIDIA CUDA installed, things will be quite simple for you.
If you don't have it, click this link to download it.

PyTorch:

Visit the PyTorch website here and choose Windows, Stable, Python, Pip, and CUDA 11.6.
Choose CPU instead if you don’t have CUDA or don't want CUDA support.
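
Once PyTorch is installed, you can check whether it actually sees your graphics card. The snippet below is a minimal sketch; pick_device is a made-up helper name, and it simply falls back to "cpu" if PyTorch is not installed yet.

```python
import importlib.util


def pick_device() -> str:
    """Return 'cuda' if PyTorch can use a CUDA GPU, otherwise 'cpu'."""
    if importlib.util.find_spec("torch") is None:
        # PyTorch is not installed in this environment yet.
        return "cpu"
    import torch

    return "cuda" if torch.cuda.is_available() else "cpu"


print("Device PyTorch would use:", pick_device())
```

If this prints cpu on a machine with an NVIDIA card, the CPU-only build of PyTorch or a missing CUDA install is the usual suspect.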

FFMPEG Tool:

FFmpeg is a tool for reading and processing audio files, and Whisper relies on it to load your recordings.

Download FFMPEG here
After this, you can install Whisper into your system. Make sure to watch the video till the end for clear instructions on how to install it correctly.
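
As a rough guide, Whisper is installed from the command line with pip install -U openai-whisper, and can then be called from Python. The sketch below follows the usage shown in the Whisper GitHub README; the helper names, the "base" model choice, and the audio.mp3 file name are placeholders for illustration.

```python
import shutil


def ffmpeg_available() -> bool:
    """Return True if the ffmpeg executable can be found on PATH."""
    return shutil.which("ffmpeg") is not None


def transcribe_file(audio_path: str, model_name: str = "base",
                    task: str = "transcribe") -> str:
    """Transcribe one audio file, or translate it into English
    when task is set to "translate"."""
    if not ffmpeg_available():
        raise RuntimeError("ffmpeg was not found on PATH; Whisper needs it to read audio.")
    import whisper  # imported here so the ffmpeg check runs first

    model = whisper.load_model(model_name)  # downloads the model on first use
    result = model.transcribe(audio_path, task=task)
    return result["text"]


# Example call (placeholder file name, not a file from this article):
# print(transcribe_file("audio.mp3"))
```

Smaller models such as "base" run faster, while larger ones are generally more accurate but need more memory.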

FAQs About Whisper AI:

Is OpenAI Whisper Free?

OpenAI has released an Automatic Speech Recognition (ASR) AI called Whisper. Unlike GPT-3 and DALL-E 2, Whisper AI is a free and open-source model. Whisper is trained on 680,000 hours of multilingual data collected from the internet.

When was Whisper Released By OpenAI?

OpenAI released Whisper, a multilingual ASR model, on September 21, 2022.

How Accurate is Whisper AI?

Whisper's accuracy is impressive, and the best way to judge it is to try it on your own recordings. OpenAI describes Whisper as approaching human-level robustness and accuracy on English speech recognition.

What Languages Does OpenAI Whisper Support?

Whisper AI can transcribe speech in roughly 99 languages, and it can translate speech from those languages into English.
