Nadira Povey

If anyone has experience with Next-gen Kaldi or backend engineering and wants to work part-time on a project, please contact me at my Gmail address, nadirapovey. I think the job would be best suited to Master's students.

My interests are Speech Processing, Text to Speech, Speech to Text, ML and AI.

Nadira Next-gen Kaldi

Nadira Kaldi

Other projects

Dan Kaldi

Date
Topics
video
Readings
June 5, 2022
June 7, 2022
June 8, 2022
June 9, 2022
June 12, 2022
Biased Language Models
June 13, 2022
June 14, 2022
LibriSpeech run.sh explained
June 15, 2022
Speech and Language Processing (3rd ed. draft): https://web.stanford.edu/~jurafsky/slp3/
Automatic Speech Recognition: A Deep Learning Approach. Amazon link: https://tinyurl.com/dcvkncpw
Google Scholar: https://scholar.google.com/scholar?hl=en&as_sdt=0%2C48&as_vis=1&q=automatic+speech+recognition+asr&btnG=
June 16, 2022
LibriSpeech run.sh explained Part2

Dan Next-gen Kaldi

Date
Topics
video
Readings
June 6, 2022
June 11, 2022
June 10, 2022
July 23, 2022
powerpoint slides: https://shorturl.at/KMVY4
July 23, 2022
RNNT BAAI Conference
powerpoint slides: https://shorturl.at/KMVY4
July 23, 2022
powerpoint slides: https://shorturl.at/KMVY4
July 23, 2022
Next-gen Kaldi for Smart Phone Devices
powerpoint slides: https://shorturl.at/KMVY4
July 23, 2022
Next-gen Kaldi vs WeNet
powerpoint slides: https://shorturl.at/KMVY4
July 23, 2022
WFST to Integrate a Language Model
powerpoint slides: https://shorturl.at/KMVY4
July 23, 2022
Data Augmentation
powerpoint slides: https://shorturl.at/KMVY4
July 23, 2022
powerpoint slides: https://shorturl.at/KMVY4
July 23, 2022
Favorite Toolkit for Students
powerpoint slides: https://shorturl.at/KMVY4
July 23, 2022
BAAI 2022 Conference Full Version
powerpoint slides: https://shorturl.at/KMVY4
September 2, 2022
Speech Recognition with Weighted Finite-State Transducers: https://cs.nyu.edu/~mohri/pub/hbka.pdf
What is HCLG.fst?: https://nadirapovey.blogspot.com/2021/12/what-is-hclgfst.html
Icefall: https://github.com/k2-fsa/icefall

YouTube Videos I Liked

Automatic Speech Recognition - An Overview
Lecture 9 - Speech Recognition (ASR) [Andrew Senior]
MIT 6.S191: Automatic Speech Recognition
Lecture 12: End-to-End Models for Speech Processing [Stanford]
I Built a Personal Speech Recognition System for my AI Assistant
you need to learn Kubernetes RIGHT NOW!!

Papers

Datasets Collected for my Research

date
dataset name
data description
# items
link
file_name
blog posts
August 18, 2022
talksatgoogle
IDs of the YouTube videos where manual and audio captions are located
3577
talksatgoogle_levenshtein_score.txt

Install Kaldi: Ubuntu
Install Kaldi: Red Hat
LibriSpeech training
#1 Which model to start with? Aspire, WSJ, LibriSpeech or Mini LibriSpeech?
#2 Next Gen Kaldi for Beginners?
#3 X-Vectors vs I-Vectors
#4 Which dataset to use to benchmark the performance?
#5 Can we fine-tune ASR models in Kaldi by training it on more audio files?
#6 Which recipe from Icefall can I start with?
#7 Can we now prepare text, segments, wav.scp, utt2spk, and spk2utt files using Lhotse scripts from Next-gen Kaldi?
#8 What are biased language models?
#9 We trained a LibriSpeech model using Kaldi scripts, what is the next step? What can we do now to improve its Word Error Rate?
#11 Recommended Books & Learning Material
#14 k2 installed but (ModuleNotFoundError: No module named 'k2')
#15 ModuleNotFoundError: No module named 'graphviz'
#16 Install lhotse
#17 Install Icefall
#18 prepare.sh
#20_1 Train Icefall model
#20_2 Files and Folders for icefall/egs/librispeech/ASR
#21 DeEsser For Free In Audacity!
What is BPE and lang_bpe_500?
Next-gen Kaldi: training and decoding for LibriSpeech dataset.
Next-gen Kaldi: Reworked Conformer Model
Next-gen Kaldi: what is it?
Next-gen Kaldi: training and decoding for LibriSpeech dataset.
Next-gen Kaldi: recent work with RNN-T
Next-gen Kaldi: My Sherpa Server Installation
How to submit PR on GitHub
BART: Abstractive Summarization