Popular Pretrained Models for ASR

Pretrained models are typically trained on broad, diverse datasets, so their performance is rarely optimal for a specific domain task. In such cases, adapting the model to the characteristics of the target domain through fine-tuning becomes crucial for achieving task-specific proficiency.

| Model | Description |
|---|---|
| Whisper | Whisper is an automatic speech recognition system developed by OpenAI. It is an encoder-decoder Transformer trained on 680,000 hours of multilingual and multitask supervised audio data collected from the web. |
| Listen, Attend, and Spell (LAS) | LAS is a Seq2Seq model with an attention mechanism designed for automatic speech recognition. It has been used successfully in various ASR applications. |
| Conformer | Conformer is an encoder architecture that combines convolution modules with self-attention, capturing both local and global dependencies in speech. It has shown strong performance across a variety of ASR tasks. |
| DeepSpeech | DeepSpeech, developed by Baidu Research, is an end-to-end automatic speech recognition system based on deep learning. It uses recurrent neural network layers trained with a Connectionist Temporal Classification (CTC) loss. |
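The CTC objective used by DeepSpeech lets a model learn from per-frame character predictions without frame-level alignments. Below is a minimal PyTorch sketch of computing a CTC loss on random tensors; the shapes, label inventory, and values are illustrative assumptions, not taken from DeepSpeech itself.

```python
import torch
import torch.nn as nn

# Illustrative dimensions: time steps, batch size, output classes
# (28 character labels plus index 0 reserved for the CTC blank).
T, N, C = 50, 4, 29

# Fake per-frame log-probabilities, as an acoustic model would emit.
log_probs = torch.randn(T, N, C).log_softmax(dim=2)

# Fake target transcripts of length 20 (label indices 1..C-1).
targets = torch.randint(1, C, (N, 20), dtype=torch.long)
input_lengths = torch.full((N,), T, dtype=torch.long)
target_lengths = torch.full((N,), 20, dtype=torch.long)

ctc_loss = nn.CTCLoss(blank=0)
loss = ctc_loss(log_probs, targets, input_lengths, target_lengths)
print(loss.item())
```

In a real system the log-probabilities would come from the network's output layer, and decoding (greedy or beam search) would collapse repeated characters and remove blanks to produce the transcript.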

Automatic Speech Recognition using Whisper

Automatic Speech Recognition (ASR) can be described simply as artificial intelligence that transforms spoken language into text. Historically, building reliable ASR systems was a significant challenge: variations in voices, accents, background noise, and speech patterns all proved formidable obstacles.
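As a quick illustration, here is a minimal transcription sketch using the open-source openai-whisper package; the checkpoint name and the audio path "audio.wav" are assumptions for illustration, not fixed requirements.

```python
import whisper  # pip install openai-whisper

# Load a pretrained checkpoint; "base" trades accuracy for speed.
model = whisper.load_model("base")

# "audio.wav" is a hypothetical local file path used for illustration.
result = model.transcribe("audio.wav")
print(result["text"])
```

Larger checkpoints such as "small", "medium", and "large" generally improve accuracy at the cost of memory and latency, which is the usual consideration when choosing a pretrained model for a target domain.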
