#20_1 Train Icefall model

To train the model, check RESULTS.md in the icefall/egs/librispeech/ASR folder for the training commands. I picked the medium-sized model to train.

  1. I created the script medium_librispeech.sh.
  2. Comment out the export CUDA_VISIBLE_DEVICES statement if you are using all GPUs. I want to use all my GPUs, so I don’t need the export statement.
  3. Change world-size from 8 to 2. I need to change world-size because I only have 2 GPUs.
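
If you do restrict training to a subset of GPUs, the value passed to --world-size should match the number of device ids listed in CUDA_VISIBLE_DEVICES. The following is a minimal sketch of that check, assuming --world-size means the number of GPUs used for training. The helper name count_visible_gpus is hypothetical and is not part of icefall.

```shell
# Hypothetical helper (not part of icefall): count the device ids in a
# comma-separated list such as the value of CUDA_VISIBLE_DEVICES.
count_visible_gpus() {
  # An empty list means no device ids were given.
  if [ -z "$1" ]; then
    echo 0
    return
  fi
  # Put each id on its own line, then count the non-empty lines.
  printf '%s\n' "$1" | tr ',' '\n' | grep -c .
}

# Example: restrict training to GPUs 0 and 1 and derive the matching world size.
export CUDA_VISIBLE_DEVICES="0,1"
WORLD_SIZE=$(count_visible_gpus "$CUDA_VISIBLE_DEVICES")
echo "--world-size $WORLD_SIZE"
```

With the 8-GPU list from RESULTS.md ("0,1,2,3,4,5,6,7") the helper returns 8, which matches the original --world-size 8.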

np@np-INTEL:/mnt/speech1/nadira/stt/icefall/egs/librispeech/ASR$ pico medium_librispeech.sh

# Using all GPUs, so the CUDA_VISIBLE_DEVICES export is not needed.

./pruned_transducer_stateless5/train.py \
  --world-size 2 \
  --num-epochs 40 \
  --start-epoch 1 \
  --full-libri 1 \
  --exp-dir pruned_transducer_stateless5/exp-M \
  --max-duration 300 \
  --use-fp16 0 \
  --num-encoder-layers 18 \
  --dim-feedforward 1024 \
  --nhead 4 \
  --encoder-dim 256 \
  --decoder-dim 512 \
  --joiner-dim 512

np@np-INTEL:/mnt/speech1/nadira/stt/icefall/egs/librispeech/ASR$ pico RESULTS.md

At the end of training, we should get the results shown below.

#### Medium

Number of model parameters: 30896748 (i.e., 30.9 M).

|                                     | test-clean | test-other | comment                                 |
|-------------------------------------|------------|------------|-----------------------------------------|
| greedy search (max sym per frame 1) | 2.88       | 6.69       | --epoch 39 --avg 17 --max-duration 600  |
| modified beam search                | 2.83       | 6.59       | --epoch 39 --avg 17 --max-duration 600  |
| fast beam search                    | 2.83       | 6.61       | --epoch 39 --avg 17 --max-duration 600  |

The training commands are:

export CUDA_VISIBLE_DEVICES="0,1,2,3,4,5,6,7"

./pruned_transducer_stateless5/train.py \
  --world-size 8 \
  --num-epochs 40 \
  --start-epoch 0 \
  --full-libri 1 \
  --exp-dir pruned_transducer_stateless5/exp-M \
  --max-duration 300 \
  --use-fp16 0 \
  --num-encoder-layers 18 \
  --dim-feedforward 1024 \
  --nhead 4 \
  --encoder-dim 256 \
  --decoder-dim 512 \
  --joiner-dim 512