- Step 1: Choose an instance on AWS
- Step 2: Download Kaldi from GitHub
- Step 3: Read the Kaldi installation instructions and follow Option 1
- Step 3: Go to tools/ and follow INSTALL instructions there.
- Step 4: go to src/ and follow INSTALL instructions there.
Kaldi installation using AWS servers.
Step 1: Choose an instance on AWS
Select the g4dn.2xlarge
instance as it has 8 vCPUs and 1 GPU, and its 75 cents per hour, so make sure to pause the instance when you’re not using it. Need 10 days of training. For Step 3, select a specific subnet. This is optional since it defaults to a random region. For Step 4, disable the “Delete on Termination” setting and change the Root’s size to 600 GB. From personal experience, 400GB isn’t going to be enough. Then, you can skip to step 7, review the instance settings, and launch the instance. I’m creating a key pair but you can use an existing one.
Step 2: Download Kaldi from GitHub
Instructions page http://kaldi-asr.org/doc/install.html .
git clone https://github.com/kaldi-asr/kaldi.git kaldi --origin upstream
cd kaldi
git pull
Step 3: Read the Kaldi installation instructions and follow Option 1
https://github.com/kaldi-asr/kaldi/blob/master/INSTALL
$ nano INSTALL
This is the official Kaldi INSTALL. Look also at INSTALL.md for the git mirror installation.
[Option 1 in the following does not apply to native Windows install, see windows/INSTALL or following Option 2]
Option 1 (bash + makefile):
Steps:
(1)
go to tools/ and follow INSTALL instructions there.
(2)
go to src/ and follow INSTALL instructions there.
Option 2 (cmake):
Go to cmake/ and follow INSTALL.md instructions there.
Note, it may not be well tested and some features are missing currently.
Step 3: Go to tools/ and follow INSTALL instructions there.
https://github.com/kaldi-asr/kaldi/blob/master/tools/INSTALL
Check for dependencies using one of the provided lines and install anything that isn’t downloaded yet.
CXX=g++-4.8 extras/check_dependencies.sh
Installed the following packages but you might need more, keep checking for dependencies using the line above.
sudo apt-get install g++ sox subversion
sudo apt-get install g++
extras/install_mkl.sh
make -j 8
When that’s done running, I saw the warning that IRSTLM is not installed, so I’m running the provided line to have it installed.
extras/install_irstlm.sh
If you recall the original installation steps we found, you’ll remember that when we’re done in the tools folder we should head to the source folder.
Step 4: go to src/ and follow INSTALL instructions there.
[ubuntu@ip-172-31-6-113 src]$ nano INSTALL
cd src
[ubuntu@ip-172-31-6-113 src]$ nano INSTALL
./configure --shared
make depend -j 8
make -j 8
You’re Done!
Transcript for tutorial on YouTube:
Today we’re going to set up an AWS instance and install Kaldi on it. First, open your AWS account and find the EC2 instances page. Click launch instance, scroll down to the “Deep Learning AMI (Ubuntu 18.04)” and select it. For Step 2, select the g4dn.2xlarge instance as it has 8 vCPUs and 1 GPU, and its 75 cents per hour, so make sure to pause the instance when you’re not using it. For Step 3, select a specific subnet. This is optional since it defaults to a random region. For Step 4, disable the “Delete on Termination” setting and change the Root’s size to 600 GB. From personal experience, 400GB isn’t going to be enough. Then, you can skip to step 7, review the instance settings, and launch the instance. I’m creating a key pair but you can use an existing one.
Now its time to open your terminal and login into your AWS instance. The address you’ll be ssh-ing into is ubuntu “at” your instance’s public IP address. If you’re using a new key .pem file, make sure to change its privacy settings by using the command “chmod 400” with your file name. Once you successfully ssh into your instance, it is time to install Kaldi.
The initial downloading steps can be found on the official Kaldi webpage. Just copy and paste the first two commands on this page into your terminal and you should be set. Next, open and read the INSTALL file in the Kaldi directory, and follow Option 1. This means you should change to the tools directory and follow the directions in the INSTALL file there. Check for dependencies using one of the provided lines and install anything that isn’t downloaded yet. There’s a bug for one dependency where even if you download it, the check dependencies command will still say its uninstalled. You can just ignore this warning if you have already installed it. Now we're going to run make with the last option in the installation instruction, since it uses multiple CPUs. When that’s done running, I saw the warning that IRSTLM is not installed, so I’m running the provided line to have it installed.
If you recall the original installation steps we found, you’ll remember that when we’re done in the tools folder we should head to the source folder. Here we can open the installation file and follow the three simple commands.
And that was it! You’re all done installing Kaldi. Thanks for watching, and hopefully his was helpful.
history.txt
1 LS
2 ls
3 tree -l
4 tree -L
5 get-apt install tree
6 apt-get install tree
7 ls
8 tree -
9 tree -L 2
10 exit
11 ls
12 tree -L 2
13 sudo apt install tree
14 tree -L 2
15 git clone https://github.com/kaldi-asr/kaldi.git kaldi --origin upstream
16 cd kaldi
17 ls
18 nano INSTALL
19 cd tools/
20 nano INSTALL
21 CXX=g++-4.8 extras/check_dependencies.sh
22 sudo apt-get install g++ sox subversion
23 CXX=g++-4.8 extras/check_dependencies.sh
24 sudo apt-get install g++
25 CXX=g++-4.8 extras/check_dependencies.sh
26 extras/install_mkl.sh
27 CXX=g++-4.8 extras/check_dependencies.sh
28 nano INSTALL
29 make -j 8
30 extras/install_irstlm.sh
31 ls
32 cd ..
33 cd src
34 ls
35 cd ..
36 nano INSTALL
37 cd src/
38 ls
39 nano INSTALL
40 ./configure --shared
41 nano INSTALL
42 make depend -j 8
43 nano INSTALL
44 make -j 8
45 cd ..
46 tree -L 2
47 tree -L 1
48 tree -L 2
49 df -hT
50 lsblk
51 ls
52 cd egs
53 ls
54 cd librispeech/
55 ls
56 cd s5
57 ls
58 sudo apt install flac
59 ls
60 nano cmd.sh
61 ls
62 cp run.sh run_edited.sh
63 ls
64 nano run_edited.sh
65 nvidia-smi
66 nvidia-smi -c 3
67 sudo nvidia-smi -c 3
68 nano run_edited.sh
69 nano local/chain/run_tdnn.sh
70 screen -S libri
71 top
72 history > history.txt
Comments:
https://github.com/npovey/speech/discussions/categories/announcements