Using Wav2Vec 2.0 / HuBERT / WavLM and Whisper from HuggingFace with SpeechBrain

This tutorial describes how to use and fine-tune pretrained models from HuggingFace within SpeechBrain. Any wav2vec 2.0, HuBERT, WavLM, or Whisper model available through the HuggingFace transformers interface can be plugged into SpeechBrain to tackle a speech-related task such as automatic speech recognition, speaker recognition, or spoken language understanding.

Recurrent Neural Networks and SpeechBrain

Recurrent Neural Networks (RNNs) offer a natural way to process sequences. This tutorial demonstrates how to use the SpeechBrain implementations of RNNs, including the LSTM, GRU, RNN, and LiGRU, a recurrent cell designed specifically for speech-related tasks. RNNs are at the core of many sequence-to-sequence models.
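
To make the idea of processing a sequence step by step concrete, here is a minimal sketch of a plain (Elman-style) recurrence in pure Python. It is not the SpeechBrain API: the function name `rnn_forward`, the weight layout, and the toy values are assumptions chosen for illustration only. Gated cells such as the LSTM, GRU, and LiGRU add gating on top of this basic update.

```python
import math


def rnn_forward(inputs, w_in, w_rec, bias, h0=None):
    """Run a plain tanh RNN over a sequence of feature vectors.

    inputs: list of T vectors, each of length D
    w_in:   H x D input weights (hypothetical layout)
    w_rec:  H x H recurrent weights
    bias:   vector of length H
    Returns a list of T hidden states, each of length H.
    """
    hidden_size = len(bias)
    h = list(h0) if h0 is not None else [0.0] * hidden_size
    states = []
    for x in inputs:
        # h_t = tanh(W_in x_t + W_rec h_{t-1} + b); the comprehension
        # reads the previous h before the name is rebound.
        h = [
            math.tanh(
                sum(w_in[i][j] * x[j] for j in range(len(x)))
                + sum(w_rec[i][k] * h[k] for k in range(hidden_size))
                + bias[i]
            )
            for i in range(hidden_size)
        ]
        states.append(h)
    return states


# Toy sequence: 3 frames of 2 features, hidden size 2.
frames = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
w_in = [[0.5, -0.5], [0.25, 0.75]]
w_rec = [[0.0, 0.1], [0.1, 0.0]]
bias = [0.0, 0.0]

states = rnn_forward(frames, w_in, w_rec, bias)
print(len(states), len(states[0]))  # 3 2
```

Each hidden state depends on the current frame and on the previous hidden state, which is how the network carries context along the sequence.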
