If you have experience with Next-Gen Kaldi or backend engineering and want to work part time on a project, please contact me at my Gmail address, nadirapovey. I think the job would suit Master's students best.
My interests are Speech Processing, Text to Speech, Speech to Text, ML and AI.
Nadira Next-gen Kaldi
Date | Topics | video | Readings |
July 2, 2022 | |||
July 3, 2022 | |||
July 4, 2022 | |||
July 6, 2022 | |||
July 7, 2022 | |||
July 8, 2022 | |||
July 20, 2022 | Hugging Face | ||
August 6, 2022 | Hugging Face | ||
September 7, 2022 | |||
September 18, 2022 | |||
Dec 15, 2022 |
Nadira Kaldi
Date | Topics | video | PowerPoint | Readings |
Jan 10, 2022 | Downloading Kaldi: http://kaldi-asr.org/doc/install.html | |||
Jan 10, 2022 | Downloading Kaldi: http://kaldi-asr.org/doc/install.html | |||
Jan 10, 2022 | LibriSpeech training script: https://github.com/kaldi-asr/kaldi/blob/master/egs/librispeech/s5/run.sh
Excellent folder structure visualization: https://eleanorchodroff.com/tutorial/kaldi/training-acoustic-models.html#create-files-for-conf | |||
Sep 19, 2022 |
Other projects
Date | Topic | Links |
July 1, 2022 | Learning wav2vec2.0 | |
July 29 |
Dan Kaldi
Date | Topics | video | Readings |
June 5, 2022 | Examples included with Kaldi: https://kaldi-asr.org/doc/examples.html
Mini LibriSpeech: https://github.com/kaldi-asr/kaldi/tree/master/egs/mini_librispeech | ||
June 7, 2022 | SRE16 xvectors: https://github.com/kaldi-asr/kaldi/tree/master/egs/sre16/v2
SRE16 Xvector Model: https://kaldi-asr.org/models/m3
X-Vectors: Robust DNN Embeddings for Speaker Recognition: https://ieeexplore.ieee.org/abstract/document/8461375
OnlineIvectorFeature Class Reference: https://kaldi-asr.org/doc/classkaldi_1_1OnlineIvectorFeature.html#af7c4234c6b1d5d807dbb4292cf36b98c
GBO notes: i-vectors and x-vectors: https://desh2608.github.io/2022-04-07-gbo-ivectors/
| ||
June 8, 2022 | |||
June 9, 2022 | |||
June 12, 2022 | Biased Language Models | ||
June 13, 2022 | |||
June 14, 2022 | LibriSpeech run.sh explained | ||
June 15, 2022 | Speech and Language Processing (3rd ed. draft) https://web.stanford.edu/~jurafsky/slp3/
Automatic Speech Recognition: A Deep Learning Approach. Amazon link https://tinyurl.com/dcvkncpw
Google Scholar: https://scholar.google.com/scholar?hl=en&as_sdt=0%2C48&as_vis=1&q=automatic+speech+recognition+asr&btnG= | ||
June 16, 2022 | LibriSpeech run.sh explained Part 2 |
Dan Next-gen Kaldi
Date | Topics | video | Readings |
June 6, 2022 | SRE16 xvectors: https://github.com/kaldi-asr/kaldi/tree/master/egs/sre16/v2
SRE16 Xvector Model: https://kaldi-asr.org/models/m3
X-Vectors: Robust DNN Embeddings for Speaker Recognition: https://ieeexplore.ieee.org/abstract/document/8461375
OnlineIvectorFeature Class Reference: https://kaldi-asr.org/doc/classkaldi_1_1OnlineIvectorFeature.html#af7c4234c6b1d5d807dbb4292cf36b98c
GBO notes: i-vectors and x-vectors: https://desh2608.github.io/2022-04-07-gbo-ivectors/ | ||
June 11, 2022 | LibriSpeech: https://github.com/k2-fsa/icefall/tree/master/egs/librispeech/ASR
TIMIT: https://github.com/k2-fsa/icefall/tree/master/egs/timit/ASR
Icefall: https://github.com/k2-fsa/icefall/tree/master/egs
TIMIT dataset https://lhotse.readthedocs.io/en/latest/cli.html?highlight=TIMIT#lhotse-download-timit | ||
June 10, 2022 | Lhotse: https://lhotse.readthedocs.io/en/latest/getting-started.html
lhotse.kaldi.export_to_kaldi: https://lhotse.readthedocs.io/en/latest/kaldi.html | ||
July 23, 2022 | powerpoint slides: https://shorturl.at/KMVY4 | ||
July 23, 2022 | RNNT BAAI Conference | powerpoint slides: https://shorturl.at/KMVY4 | |
July 23, 2022 | powerpoint slides: https://shorturl.at/KMVY4 | ||
July 23, 2022 | Next-gen Kaldi for Smart Phone Devices | powerpoint slides: https://shorturl.at/KMVY4 | |
July 23, 2022 | Next-gen Kaldi vs WeNet | powerpoint slides: https://shorturl.at/KMVY4 | |
July 23, 2022 | WFST to Integrate a Language Model | powerpoint slides: https://shorturl.at/KMVY4 | |
July 23, 2022 | Data Augmentation | powerpoint slides: https://shorturl.at/KMVY4 | |
July 23, 2022 | powerpoint slides: https://shorturl.at/KMVY4 | ||
July 23, 2022 | Favorite Toolkit for Students | powerpoint slides: https://shorturl.at/KMVY4 | |
July 23, 2022 | BAAI 2022 Conference Full Version | powerpoint slides: https://shorturl.at/KMVY4 | |
September 2, 2022 | Speech Recognition with weighted finite-state transducers: https://cs.nyu.edu/~mohri/pub/hbka.pdf
What is HCLG.fst?: https://nadirapovey.blogspot.com/2021/12/what-is-hclgfst.html
Icefall: https://github.com/k2-fsa/icefall |
YouTube Videos I Liked
Automatic Speech Recognition - An Overview | |
Lecture 9 - Speech Recognition (ASR) [Andrew Senior] | |
MIT 6.S191: Automatic Speech Recognition | |
Lecture 12: End-to-End Models for Speech Processing [Stanford] | |
I Built a Personal Speech Recognition System for my AI Assistant | |
you need to learn Kubernetes RIGHT NOW!! |
Papers
Paper | link | Date |
26 Jun 2022 |
Datasets Collected for my Research
date | dataset name | data description | #items | link | file_name | blog posts |
August 18, 2022 | talksatgoogle | IDs of the YouTube videos that have both manual and audio captions | 3577 | talksatgoogle_levenshtein_score.txt | ||
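The file name talksatgoogle_levenshtein_score.txt suggests that each video is scored by the Levenshtein distance between its two caption tracks. The exact scoring used for this dataset is not stated on this page, so the following is only a minimal sketch, assuming a word-level edit distance normalized by the longer transcript. The function names `levenshtein` and `similarity` are hypothetical and are not taken from the dataset.

```python
def levenshtein(a, b):
    """Edit distance between two sequences (insertions, deletions, substitutions)."""
    # prev[j] holds the distance between the first i-1 items of a and the first j items of b.
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1,          # delete x from a
                            curr[j - 1] + 1,      # insert y into a
                            prev[j - 1] + cost))  # substitute, or keep a match
        prev = curr
    return prev[-1]


def similarity(manual, auto):
    """Assumed normalization: 1.0 means the two word sequences are identical."""
    ref, hyp = manual.split(), auto.split()
    longest = max(len(ref), len(hyp))
    if longest == 0:
        return 1.0
    return 1.0 - levenshtein(ref, hyp) / longest


if __name__ == "__main__":
    print(levenshtein("kitten", "sitting"))
    print(similarity("the cat sat", "the cat sat down"))
```

Working on words rather than characters keeps the score close to how word error rate is counted in speech recognition, but a character-level score would use the same `levenshtein` function on the raw strings.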
Ask questions at: https://github.com/npovey/speech/discussions