#7 Can we now prepare text, segments, wav.scp, utt2spk, and spk2utt files using Lhotse scripts from Next-gen Kaldi?

https://youtu.be/zsaRpHGrRlA

Answer: No

Currently, to train with Kaldi, we need to create the text, segments, wav.scp, utt2spk, and spk2utt files.

Can we now prepare these files using Lhotse scripts from Next-gen Kaldi?

We first have to know how to prepare the data in Lhotse style, as recordings and supervisions. We can then convert from Lhotse style to Kaldi style with the Lhotse function lhotse.kaldi.export_to_kaldi(recordings, supervisions, output_dir, map_underscores_to=None, prefix_spk_id=False).
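As a rough illustration of that two-step flow, here is a minimal sketch that builds Lhotse manifests for a single audio file and exports them. It assumes Lhotse is installed (pip install lhotse). The helper name export_one_recording and the shape of the utterances argument are hypothetical, chosen only for this example. Only the export_to_kaldi call and its parameters come from the text above.

```python
from pathlib import Path


def export_one_recording(wav_path, utterances, output_dir):
    """Build Lhotse manifests for one audio file and export them Kaldi-style.

    `utterances` is a hypothetical list of tuples:
    (utt_id, start_sec, duration_sec, speaker, text).
    """
    # Imported lazily so this module can be loaded without Lhotse installed.
    from lhotse import Recording, RecordingSet, SupervisionSegment, SupervisionSet
    from lhotse.kaldi import export_to_kaldi

    # Lhotse style, part 1: describe the audio (the file header is read here).
    recording = Recording.from_file(wav_path)
    recordings = RecordingSet.from_recordings([recording])

    # Lhotse style, part 2: describe who said what, and when.
    supervisions = SupervisionSet.from_segments(
        SupervisionSegment(
            id=utt_id,
            recording_id=recording.id,
            start=start,
            duration=duration,
            channel=0,
            speaker=speaker,
            text=text,
        )
        for utt_id, start, duration, speaker, text in utterances
    )

    # Convert from Lhotse style to a Kaldi-style data directory.
    out = Path(output_dir)
    out.mkdir(parents=True, exist_ok=True)
    export_to_kaldi(
        recordings,
        supervisions,
        output_dir=out,
        map_underscores_to=None,
        prefix_spk_id=False,
    )
    # Return the names of the files that were actually written.
    return sorted(p.name for p in out.iterdir())
```

A call might look like export_one_recording("audio/rec1.wav", [("rec1-0001", 0.0, 2.5, "spk1", "hello world")], "data/train"), where the paths and IDs are placeholders. Checking the returned file list is a simple way to see which of the five Kaldi files the export produced.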

In other words, we prepare the data in Lhotse format and then export it to the current Kaldi format.

We cannot get those five files directly from the Lhotse scripts.
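One of the five files, spk2utt, holds the same information as utt2spk, grouped by speaker. Kaldi recipes commonly generate it from utt2spk with utils/utt2spk_to_spk2utt.pl. If a data directory has utt2spk but no spk2utt, the plain-Python sketch below does a similar job. The helper name write_spk2utt is hypothetical and is not part of Lhotse or Kaldi.

```python
from collections import defaultdict
from pathlib import Path


def write_spk2utt(data_dir):
    """Derive spk2utt from utt2spk in a Kaldi-style data directory."""
    data_dir = Path(data_dir)
    spk2utt = defaultdict(list)

    # Each utt2spk line is "<utterance-id> <speaker-id>".
    for line in (data_dir / "utt2spk").read_text().splitlines():
        if not line.strip():
            continue  # skip blank lines
        utt, spk = line.split()
        spk2utt[spk].append(utt)

    # Each spk2utt line is "<speaker-id> <utt-id-1> <utt-id-2> ...".
    # Sorting by code point matches the C-locale order Kaldi expects for ASCII IDs.
    with open(data_dir / "spk2utt", "w") as f:
        for spk in sorted(spk2utt):
            f.write(f"{spk} {' '.join(sorted(spk2utt[spk]))}\n")

    return dict(spk2utt)
```

Running Kaldi's own utils/fix_data_dir.sh afterwards is still a good idea, since it checks that all the files in the directory agree with each other.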

#7 data prep youtube auto transcript