Next-gen Kaldi: My Sherpa Server Installation

My Conda Environment Information:

This information is useful when you start getting error messages and need to check which versions of each program you have installed.

(base) np@np-INTEL:/mnt/speech1/nadira/stt/sherpa$ python3 -m torch.utils.collect_env
Collecting environment information...
PyTorch version: 1.9.0
Is debug build: False
CUDA used to build PyTorch: 11.1
ROCM used to build PyTorch: N/A

OS: Ubuntu 20.04.4 LTS (x86_64)
GCC version: (Ubuntu 9.4.0-1ubuntu1~20.04.1) 9.4.0
Clang version: Could not collect
CMake version: version 3.16.3
Libc version: glibc-2.31

Python version: 3.8 (64-bit runtime)
Python platform: Linux-5.15.0-46-generic-x86_64-with-glibc2.17
Is CUDA available: True
CUDA runtime version: 10.1.243
GPU models and configuration: 
GPU 0: NVIDIA GeForce RTX 3080
GPU 1: NVIDIA GeForce RTX 3080

Nvidia driver version: 510.85.02
cuDNN version: Could not collect
HIP runtime version: N/A
MIOpen runtime version: N/A

Versions of relevant libraries:
[pip3] numpy==1.22.4
[pip3] numpydoc==1.2
[pip3] torch==1.9.0
[pip3] torchaudio==0.9.0a0+33b2469
[conda] blas                      1.0                         mkl  
[conda] cudatoolkit               11.1.1               h6406543_8    conda-forge
[conda] k2                        1.15.1.dev20220419 cuda11.1_py3.8_torch1.9.0    k2-fsa
[conda] mkl                       2021.4.0           h06a4308_640  
[conda] mkl-service               2.4.0            py38h7f8727e_0  
[conda] mkl_fft                   1.3.1            py38hd3c417c_0  
[conda] mkl_random                1.2.2            py38h51133e4_0  
[conda] numpy                     1.21.5           py38he7a7128_1  
[conda] numpy-base                1.21.5           py38hf524024_1  
[conda] numpydoc                  1.2                pyhd3eb1b0_0  
[conda] pytorch                   1.9.0           py3.8_cuda11.1_cudnn8.0.5_0    pytorch
[conda] torchaudio                0.9.0                      py38    pytorch
(base) np@np-INTEL:/mnt/speech1/nadira/stt/sherpa$
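
For a quicker check than the full report above, the versions of a few packages can be read from Python's package metadata. This is a minimal sketch using only the standard library; the package list is just an example, and packages installed by conda without Python metadata may show up as not installed.

```python
from importlib import metadata


def installed_versions(packages):
    """Return {name: version or None} for each distribution name."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # The distribution has no metadata visible to this interpreter.
            versions[name] = None
    return versions


if __name__ == "__main__":
    # Example package list (an assumption); edit it to match your setup.
    for name, version in installed_versions(["torch", "torchaudio", "k2", "numpy"]).items():
        print(f"{name:12s} {version or 'not installed'}")
```

Run it inside the same conda environment that you use for sherpa, otherwise it reports the versions of a different interpreter.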

Step 1: Activate conda environment

np@np-INTEL:/mnt/speech1/nadira/stt/sherpa/sherpa/bin/web$ source activate
(base) np@np-INTEL:/mnt/speech1/nadira/stt/sherpa/sherpa/bin/web

Step 2: Install cuDNN only if you get an error saying you don’t have it.

Method 1

conda install cudnn

Method 2

If the above command doesn’t work, try this:

My machine couldn’t locate the cudnn.h file needed to install the sherpa server, so I had to do this step first.

(base) np@np-INTEL:~/Downloads$ sudo dpkg -i libcudnn8_8.0.5.39-1+cuda11.1_amd64.deb
(base) np@np-INTEL:~/Downloads$ sudo dpkg -i libcudnn8-dev_8.0.5.39-1+cuda11.1_amd64.deb
(base) np@np-INTEL:~/Downloads$ sudo dpkg -i libcudnn8-samples_8.0.5.39-1+cuda11.1_amd64.deb 

(base) np@np-INTEL:~$ sudo apt-get install libcudnn8=
(base) np@np-INTEL:~$ sudo apt-get install libcudnn8-dev=
(base) np@np-INTEL:~$ sudo apt-get install libcudnn8-samples=
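
Before and after installing cuDNN, it can help to confirm whether cudnn.h is actually visible. The sketch below looks for the header in a few directories. The directory list is an assumption (common Ubuntu, CUDA, and conda locations), not something the sherpa build is known to search, so adjust it for your machine.

```python
import os
from pathlib import Path


def find_header(header, search_dirs):
    """Return the first existing path to `header` under search_dirs, else None."""
    for directory in search_dirs:
        candidate = Path(directory) / header
        if candidate.is_file():
            return candidate
    return None


def default_search_dirs():
    # Assumed locations only; your CUDA/cuDNN layout may differ.
    dirs = ["/usr/include", "/usr/include/x86_64-linux-gnu", "/usr/local/cuda/include"]
    conda_prefix = os.environ.get("CONDA_PREFIX")
    if conda_prefix:
        dirs.append(os.path.join(conda_prefix, "include"))
    return dirs


if __name__ == "__main__":
    found = find_header("cudnn.h", default_search_dirs())
    print(found if found else "cudnn.h not found in the assumed directories")
```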

Step 3: I had to install k2 in my base conda env.

This was my first time working in the conda base environment, and k2 was missing.

(base) np@np-INTEL:/mnt/speech1/nadira/stt$ conda install -c k2-fsa -c pytorch -c conda-forge k2
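
The environment listing at the top shows the k2 build string cuda11.1_py3.8_torch1.9.0, which names the CUDA, Python, and PyTorch versions that build targets. If k2 fails to import after installation, comparing that string with your PyTorch version is a reasonable first check. The sketch below assumes the build string has exactly the format shown in the listing; other formats (for example CPU-only builds) raise an error rather than being guessed at.

```python
import re


def parse_k2_build(build):
    """Split a k2 conda build string into its CUDA, Python, and PyTorch parts."""
    pattern = r"cuda(?P<cuda>[\d.]+)_py(?P<python>[\d.]+)_torch(?P<torch>[\d.]+)"
    match = re.fullmatch(pattern, build)
    if match is None:
        raise ValueError(f"unrecognized build string: {build!r}")
    return match.groupdict()


def k2_matches_torch(build, torch_version, torch_cuda):
    """Return True if the build string agrees with the given PyTorch and CUDA versions."""
    parts = parse_k2_build(build)
    # Drop local suffixes such as "+cu111" before comparing.
    return parts["torch"] == torch_version.split("+")[0] and parts["cuda"] == torch_cuda


if __name__ == "__main__":
    # Values copied from the environment listing in this post.
    print(parse_k2_build("cuda11.1_py3.8_torch1.9.0"))
    print(k2_matches_torch("cuda11.1_py3.8_torch1.9.0", "1.9.0", "11.1"))
```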

Step 4: Install sherpa server

Here is a link to the instructions:

git clone
cd sherpa

# Install the dependencies
pip install -r ./requirements.txt

# Install the C++ extension.
# Use one of the following methods:
# (1)
python3 install --verbose
# (2)
# pip install --verbose k2-sherpa

# To uninstall the C++ extension, use
# pip uninstall k2-sherpa

Decoding Method 1

Step 1: Start a streaming server.

(base) np@np-INTEL:/mnt/speech1/nadira/stt/sherpa/sherpa/bin/pruned_stateless_emformer_rnnt2$ ./ --port 6006 --max-batch-size 10 --max-wait-ms 5 --max-active-connections 1 --nn-pool-size 1 --nn-model-filename /mnt/speech1/nadira/stt/icefall-asr-librispeech-pruned-stateless-emformer-rnnt2-2022-06-01/exp/ --bpe-model-filename /mnt/speech1/nadira/stt/icefall-asr-librispeech-pruned-stateless-emformer-rnnt2-2022-06-01/data/lang_bpe_500/bpe.model
2022-09-17 06:03:42,776 INFO [] {'decoding_method': 'greedy_search', 'beam': 10.0, 'num_paths': 200, 'num_active_paths': 4, 'nbest_scale': 0.5, 'temperature': 1.0, 'max_contexts': 8, 'max_states': 32, 'lang_dir': PosixPath('data/lang_bpe_500'), 'ngram_lm_scale': 0.01}
2022-09-17 06:03:42,776 INFO [] {'endpoint_rule1_must_contain_nonsilence': False, 'endpoint_rule1_min_trailing_silence': 5.0, 'endpoint_rule1_min_utterance_length': 0.0, 'endpoint_rule2_must_contain_nonsilence': True, 'endpoint_rule2_min_trailing_silence': 2.0, 'endpoint_rule2_min_utterance_length': 0.0, 'endpoint_rule3_must_contain_nonsilence': False, 'endpoint_rule3_min_trailing_silence': 0.0, 'endpoint_rule3_min_utterance_length': 20.0}
2022-09-17 06:03:42,776 INFO [] {'decoding_method': 'greedy_search', 'beam': 10.0, 'num_paths': 200, 'num_active_paths': 4, 'nbest_scale': 0.5, 'temperature': 1.0, 'max_contexts': 8, 'max_states': 32, 'lang_dir': PosixPath('data/lang_bpe_500'), 'ngram_lm_scale': 0.01, 'endpoint_rule1_must_contain_nonsilence': False, 'endpoint_rule1_min_trailing_silence': 5.0, 'endpoint_rule1_min_utterance_length': 0.0, 'endpoint_rule2_must_contain_nonsilence': True, 'endpoint_rule2_min_trailing_silence': 2.0, 'endpoint_rule2_min_utterance_length': 0.0, 'endpoint_rule3_must_contain_nonsilence': False, 'endpoint_rule3_min_trailing_silence': 0.0, 'endpoint_rule3_min_utterance_length': 20.0, 'port': 6006, 'nn_model_filename': '/mnt/speech1/nadira/stt/icefall-asr-librispeech-pruned-stateless-emformer-rnnt2-2022-06-01/exp/', 'bpe_model_filename': '/mnt/speech1/nadira/stt/icefall-asr-librispeech-pruned-stateless-emformer-rnnt2-2022-06-01/data/lang_bpe_500/bpe.model', 'token_filename': None, 'decode_chunk_size': 8, 'decode_left_context': 32, 'decode_right_context': 2, 'nn_pool_size': 1, 'max_batch_size': 10, 'max_wait_ms': 5.0, 'max_message_size': 1048576, 'max_queue_size': 32, 'max_active_connections': 1}
2022-09-17 06:03:42,814 INFO [] Using device: cuda:0
2022-09-17 06:03:45,785 INFO [] Warmup start
2022-09-17 06:03:46,506 INFO [] Warmup done
2022-09-17 06:03:46,507 INFO [] server listening on [::]:6006
2022-09-17 06:03:46,507 INFO [] server listening on
2022-09-17 06:04:35,178 INFO [] connection open
2022-09-17 06:04:35,178 INFO [] Connected: ('', 50724). Number of connections: 1/1
2022-09-17 06:04:35,182 INFO [] Disconnected: ('', 50724). Number of connections: 0/1
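
In the log above, about four seconds pass between startup and the "server listening" line. If you launch the client from a script, it may be worth waiting until the port accepts connections. This is a minimal sketch with the standard library; the function name and timeout values are assumptions. Note that the probe opens and closes a plain TCP connection, so the server may log a short-lived connection for it.

```python
import socket
import time


def wait_for_port(host, port, timeout=30.0, interval=0.5):
    """Return True once a TCP connection to (host, port) succeeds, False on timeout."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            # create_connection tries every address that host resolves to.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)


# Example call for the server started above:
#   wait_for_port("localhost", 6006)
```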

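If you get the error “OSError: [Errno 98] error while attempting to bind on address ('', 6006): address already in use”, another process is already using port 6006. The following command kills whatever process holds that port: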

sudo kill -9 $(sudo lsof -ti :6006)
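
On Linux, Errno 98 is EADDRINUSE. To check whether a port is taken before starting the server (or before killing anything), you can try to bind it yourself. This is a sketch only: the function name is made up for illustration, and because it uses a plain bind it may also report a port as busy for a short time after a previous server has exited.

```python
import errno
import socket


def port_in_use(port, host=""):
    """Return True if binding (host, port) fails with EADDRINUSE."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError as exc:
            if exc.errno == errno.EADDRINUSE:
                return True
            # Any other failure (e.g. permission denied) is passed on unchanged.
            raise
    return False


# Example call for the port used in this post:
#   port_in_use(6006)
```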

Step 2: Open a second terminal to decode a test file

(base) np@np-INTEL:/mnt/speech1/nadira/stt/sherpa/sherpa/bin$ ./pruned_stateless_emformer_rnnt2/ --server-addr localhost --server-port 6006 /mnt/speech1/nadira/stt/icefall-asr-librispeech-pruned-stateless-emformer-rnnt2-2022-06-01/test_wavs/1221-135766-0001.wav

[Screenshot: decoding of the 1221-135766-0001.wav file]


Decoding Method 2: Streaming

Step 1: Open a new terminal and start the streaming server:

(base) np@np-INTEL:/mnt/speech1/nadira/stt/sherpa/sherpa/bin/pruned_stateless_emformer_rnnt2$ ./ --port 6006 --max-batch-size 10 --max-wait-ms 5 --max-active-connections 1 --nn-pool-size 1 --nn-model-filename /mnt/speech1/nadira/stt/icefall-asr-librispeech-pruned-stateless-emformer-rnnt2-2022-06-01/exp/ --bpe-model-filename /mnt/speech1/nadira/stt/icefall-asr-librispeech-pruned-stateless-emformer-rnnt2-2022-06-01/data/lang_bpe_500/bpe.model

If you get the error “OSError: [Errno 98] error while attempting to bind on address ('', 6006): address already in use”, free port 6006 first:

sudo kill -9 $(sudo lsof -ti :6006)

Step 2: Open another terminal and serve the web client from the web directory:

(base) np@np-INTEL:/mnt/speech1/nadira/stt/sherpa/sherpa/bin/web$ python3 -m http.server 6008

If this fails because port 6008 is already in use, try:

sudo kill -9 $(sudo lsof -ti :6008)

Step 3: Open http://localhost:6008/ in a web browser

Choose the mode and start testing k2 models.
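
If the page does not load, it helps to find out whether the HTTP server from Step 2 is answering at all. The sketch below sends one GET request with the standard library and returns the status code. The function name is an assumption for illustration, and it checks only the static web page, not the streaming server on port 6006.

```python
import http.client


def page_status(host, port, path="/", timeout=3.0):
    """Return the HTTP status code for GET path, or None if the server is unreachable."""
    conn = http.client.HTTPConnection(host, port, timeout=timeout)
    try:
        conn.request("GET", path)
        return conn.getresponse().status
    except (OSError, http.client.HTTPException):
        # Connection refused, timed out, or the reply was not valid HTTP.
        return None
    finally:
        conn.close()


# Example call for the page served in Step 2:
#   page_status("localhost", 6008)
```

A result of 200 means the page is being served; None means nothing is answering HTTP on that port.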
