Next-gen Kaldi: My Sherpa Server Installation

This blog post was put together using the resources below:

  1. Sherpa server installation guide: https://k2-fsa.github.io/sherpa/python/installation/from-source.html#install-sherpa
  2. k2 conda installation guide (covers all required packages): https://k2-fsa.github.io/k2/installation/conda.html
  3. YouTube demonstration: https://www.youtube.com/watch?v=z7HgaZv5W0U&ab_channel=FangjunKuang

My Conda Environment Information:

This information is useful when you start getting error messages and need to check which versions of the relevant programs you have installed.

(base) np@np-INTEL:/mnt/speech1/nadira/stt/sherpa$ python3 -m torch.utils.collect_env
Collecting environment information...
PyTorch version: 1.9.0
Is debug build: False
CUDA used to build PyTorch: 11.1
ROCM used to build PyTorch: N/A

OS: Ubuntu 20.04.4 LTS (x86_64)
GCC version: (Ubuntu 9.4.0-1ubuntu1~20.04.1) 9.4.0
Clang version: Could not collect
CMake version: version 3.16.3
Libc version: glibc-2.31

Python version: 3.8 (64-bit runtime)
Python platform: Linux-5.15.0-46-generic-x86_64-with-glibc2.17
Is CUDA available: True
CUDA runtime version: 10.1.243
GPU models and configuration: 
GPU 0: NVIDIA GeForce RTX 3080
GPU 1: NVIDIA GeForce RTX 3080

Nvidia driver version: 510.85.02
cuDNN version: Could not collect
HIP runtime version: N/A
MIOpen runtime version: N/A

Versions of relevant libraries:
[pip3] numpy==1.22.4
[pip3] numpydoc==1.2
[pip3] torch==1.9.0
[pip3] torchaudio==0.9.0a0+33b2469
[conda] blas                      1.0                         mkl  
[conda] cudatoolkit               11.1.1               h6406543_8    conda-forge
[conda] k2                        1.15.1.dev20220419 cuda11.1_py3.8_torch1.9.0    k2-fsa
[conda] mkl                       2021.4.0           h06a4308_640  
[conda] mkl-service               2.4.0            py38h7f8727e_0  
[conda] mkl_fft                   1.3.1            py38hd3c417c_0  
[conda] mkl_random                1.2.2            py38h51133e4_0  
[conda] numpy                     1.21.5           py38he7a7128_1  
[conda] numpy-base                1.21.5           py38hf524024_1  
[conda] numpydoc                  1.2                pyhd3eb1b0_0  
[conda] pytorch                   1.9.0           py3.8_cuda11.1_cudnn8.0.5_0    pytorch
[conda] torchaudio                0.9.0                      py38    pytorch
(base) np@np-INTEL:/mnt/speech1/nadira/stt/sherpa$

Step 1: Activate conda environment

np@np-INTEL:/mnt/speech1/nadira/stt/sherpa/sherpa/bin/web$ source activate
(base) np@np-INTEL:/mnt/speech1/nadira/stt/sherpa/sherpa/bin/web

Step 2: Install cuDNN only if you get an error saying you don’t have it.

Method 1

conda install cudnn

Method 2

If the above command doesn't work, try this:

My machine couldn't locate the cudnn.h header needed to build the Sherpa server, so I had to install the cuDNN .deb packages manually first.

(base) np@np-INTEL:~/Downloads$ sudo dpkg -i libcudnn8_8.0.5.39-1+cuda11.1_amd64.deb
(base) np@np-INTEL:~/Downloads$ sudo dpkg -i libcudnn8-dev_8.0.5.39-1+cuda11.1_amd64.deb
(base) np@np-INTEL:~/Downloads$ sudo dpkg -i libcudnn8-samples_8.0.5.39-1+cuda11.1_amd64.deb 

(base) np@np-INTEL:~$ sudo apt-get install libcudnn8=8.0.5.39-1+cuda11.1
(base) np@np-INTEL:~$ sudo apt-get install libcudnn8-dev=8.0.5.39-1+cuda11.1
(base) np@np-INTEL:~$ sudo apt-get install libcudnn8-samples=8.0.5.39-1+cuda11.1
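After installing, you can sanity-check that the header the Sherpa build needs is actually visible. A minimal stdlib-only Python sketch (the search directories are assumptions; adjust them for your system):

```python
import glob
import os


def find_cudnn_headers(dirs=("/usr/include", "/usr/local/cuda/include")):
    """Return paths of any cudnn*.h headers found in the given directories."""
    hits = []
    for d in dirs:
        hits.extend(glob.glob(os.path.join(d, "cudnn*.h")))
    return sorted(hits)


if __name__ == "__main__":
    headers = find_cudnn_headers()
    print(headers if headers else "cudnn.h not found -- install cuDNN first")
```

If this prints an empty result, the build will likely fail with the same "cannot find cudnn.h" error described above.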

Step 3: I had to install k2 in my base conda env.

This was my first time working in the conda base environment, and k2 was missing there:

(base) np@np-INTEL:/mnt/speech1/nadira/stt$ conda install -c k2-fsa -c pytorch -c conda-forge k2

Step 4: Install sherpa server

These commands come from the installation instructions linked above:

git clone https://github.com/k2-fsa/sherpa
cd sherpa

# Install the dependencies
pip install -r ./requirements.txt

# Install the C++ extension.
# Use one of the following methods:
#
# (1)
python3 setup.py install --verbose
#
# (2)
# pip install --verbose k2-sherpa

# To uninstall the C++ extension, use
# pip uninstall k2-sherpa

Decoding Method 1

Step 1: Start a streaming server.

(base) np@np-INTEL:/mnt/speech1/nadira/stt/sherpa/sherpa/bin/pruned_stateless_emformer_rnnt2$ ./streaming_server.py --port 6006 --max-batch-size 10 --max-wait-ms 5 --max-active-connections 1 --nn-pool-size 1 --nn-model-filename /mnt/speech1/nadira/stt/icefall-asr-librispeech-pruned-stateless-emformer-rnnt2-2022-06-01/exp/cpu_jit-epoch-39-avg-6-use-averaged-model-1-torch-1.6.0.pt --bpe-model-filename /mnt/speech1/nadira/stt/icefall-asr-librispeech-pruned-stateless-emformer-rnnt2-2022-06-01/data/lang_bpe_500/bpe.model
2022-09-17 06:03:42,776 INFO [streaming_server.py:537] {'decoding_method': 'greedy_search', 'beam': 10.0, 'num_paths': 200, 'num_active_paths': 4, 'nbest_scale': 0.5, 'temperature': 1.0, 'max_contexts': 8, 'max_states': 32, 'lang_dir': PosixPath('data/lang_bpe_500'), 'ngram_lm_scale': 0.01}
2022-09-17 06:03:42,776 INFO [streaming_server.py:540] {'endpoint_rule1_must_contain_nonsilence': False, 'endpoint_rule1_min_trailing_silence': 5.0, 'endpoint_rule1_min_utterance_length': 0.0, 'endpoint_rule2_must_contain_nonsilence': True, 'endpoint_rule2_min_trailing_silence': 2.0, 'endpoint_rule2_min_utterance_length': 0.0, 'endpoint_rule3_must_contain_nonsilence': False, 'endpoint_rule3_min_trailing_silence': 0.0, 'endpoint_rule3_min_utterance_length': 20.0}
2022-09-17 06:03:42,776 INFO [streaming_server.py:546] {'decoding_method': 'greedy_search', 'beam': 10.0, 'num_paths': 200, 'num_active_paths': 4, 'nbest_scale': 0.5, 'temperature': 1.0, 'max_contexts': 8, 'max_states': 32, 'lang_dir': PosixPath('data/lang_bpe_500'), 'ngram_lm_scale': 0.01, 'endpoint_rule1_must_contain_nonsilence': False, 'endpoint_rule1_min_trailing_silence': 5.0, 'endpoint_rule1_min_utterance_length': 0.0, 'endpoint_rule2_must_contain_nonsilence': True, 'endpoint_rule2_min_trailing_silence': 2.0, 'endpoint_rule2_min_utterance_length': 0.0, 'endpoint_rule3_must_contain_nonsilence': False, 'endpoint_rule3_min_trailing_silence': 0.0, 'endpoint_rule3_min_utterance_length': 20.0, 'port': 6006, 'nn_model_filename': '/mnt/speech1/nadira/stt/icefall-asr-librispeech-pruned-stateless-emformer-rnnt2-2022-06-01/exp/cpu_jit-epoch-39-avg-6-use-averaged-model-1-torch-1.6.0.pt', 'bpe_model_filename': '/mnt/speech1/nadira/stt/icefall-asr-librispeech-pruned-stateless-emformer-rnnt2-2022-06-01/data/lang_bpe_500/bpe.model', 'token_filename': None, 'decode_chunk_size': 8, 'decode_left_context': 32, 'decode_right_context': 2, 'nn_pool_size': 1, 'max_batch_size': 10, 'max_wait_ms': 5.0, 'max_message_size': 1048576, 'max_queue_size': 32, 'max_active_connections': 1}
2022-09-17 06:03:42,814 INFO [streaming_server.py:237] Using device: cuda:0
2022-09-17 06:03:45,785 INFO [streaming_server.py:309] Warmup start
2022-09-17 06:03:46,506 INFO [streaming_server.py:323] Warmup done
2022-09-17 06:03:46,507 INFO [server.py:707] server listening on [::]:6006
2022-09-17 06:03:46,507 INFO [server.py:707] server listening on 0.0.0.0:6006
2022-09-17 06:04:35,178 INFO [server.py:642] connection open
2022-09-17 06:04:35,178 INFO [streaming_server.py:442] Connected: ('127.0.0.1', 50724). Number of connections: 1/1
2022-09-17 06:04:35,182 INFO [streaming_server.py:426] Disconnected: ('127.0.0.1', 50724). Number of connections: 0/1

If you get Error: “OSError: [Errno 98] error while attempting to bind on address ('0.0.0.0', 6006): address already in use” try this:

sudo kill -9 $(sudo lsof -ti :6006)
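Before reaching for kill -9, you can check whether anything is actually listening on the port. A small Python sketch (6006 is just the port used above; nothing here is Sherpa-specific):

```python
import socket


def port_in_use(port, host="127.0.0.1"):
    """Return True if something is already listening on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.settimeout(1.0)
        # connect_ex returns 0 when the connection succeeds,
        # i.e. when a listener is present on the port.
        return s.connect_ex((host, port)) == 0


if __name__ == "__main__":
    print("port 6006 in use:", port_in_use(6006))
```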

Step 2: Open a second terminal to decode a test file.

(base) np@np-INTEL:/mnt/speech1/nadira/stt/sherpa/sherpa/bin$ ./pruned_stateless_emformer_rnnt2/streaming_client.py --server-addr localhost --server-port 6006 /mnt/speech1/nadira/stt/icefall-asr-librispeech-pruned-stateless-emformer-rnnt2-2022-06-01/test_wavs/1221-135766-0001.wav

[Screenshot: decoding output for 1221-135766-0001.wav]
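Under the hood, the streaming client reads the WAV file and sends it to the server in small chunks rather than all at once. A rough, stdlib-only sketch of that chunking step (16-bit mono audio is an assumption here; the real streaming_client.py also handles the WebSocket transport):

```python
import struct
import wave


def wav_to_chunks(path, chunk_ms=100):
    """Yield fixed-duration chunks of samples (scaled to [-1, 1)) from a
    16-bit mono WAV file -- roughly the unit a streaming ASR client
    would send per message."""
    with wave.open(path, "rb") as w:
        assert w.getsampwidth() == 2 and w.getnchannels() == 1
        frames_per_chunk = w.getframerate() * chunk_ms // 1000
        while True:
            raw = w.readframes(frames_per_chunk)
            if not raw:
                break
            n = len(raw) // 2  # two bytes per 16-bit sample
            yield [s / 32768.0 for s in struct.unpack(f"<{n}h", raw)]
```

Sending audio incrementally like this is what lets the server emit partial transcripts while the utterance is still playing.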

Decoding Method 2: Streaming

Step 1: Start the streaming server in a new terminal:

(base) np@np-INTEL:/mnt/speech1/nadira/stt/sherpa/sherpa/bin/pruned_stateless_emformer_rnnt2$ ./streaming_server.py --port 6006 --max-batch-size 10 --max-wait-ms 5 --max-active-connections 1 --nn-pool-size 1 --nn-model-filename /mnt/speech1/nadira/stt/icefall-asr-librispeech-pruned-stateless-emformer-rnnt2-2022-06-01/exp/cpu_jit-epoch-39-avg-6-use-averaged-model-1-torch-1.6.0.pt --bpe-model-filename /mnt/speech1/nadira/stt/icefall-asr-librispeech-pruned-stateless-emformer-rnnt2-2022-06-01/data/lang_bpe_500/bpe.model

If you get Error: “OSError: [Errno 98] error while attempting to bind on address ('0.0.0.0', 6006): address already in use” try this:

sudo kill -9 $(sudo lsof -ti :6006)

Step 2: In another new terminal, serve the web client with a simple HTTP server.

(base) np@np-INTEL:/mnt/speech1/nadira/stt/sherpa/sherpa/bin/web$ python3 -m http.server 6008

If it errors out with the same address-in-use error, try:

sudo kill -9 $(sudo lsof -ti :6008)

Step 3: Open http://localhost:6008/ in a browser.

Choose the mode and start testing k2 models.

[Screenshot: Sherpa web demo page]