UNIX Shell Cmds

Basic operations
.., ., ~	.. means the parent directory, . means the current directory, ~ means the home directory
ls	list: shows the names of the files in the current directory (ls . is equivalent)
ls ~	list the contents of the home directory
ls -l	long listing: shows details (permissions, owner, size, date) for each entry in this folder
ll	similar to ls -l (an alias on many systems)
ls -l Documents/*.pdf	list all the PDF files in the Documents directory in detail
ls -al	show details of all files, including hidden ones (first character -: regular file, d: directory)
pwd	print working directory: prints the path of the directory you are in at the moment
cd, cd ~/, cd ~	go to the home directory
cd /…/…/…/…	change directory: goes to the given path
cd /…/…/…/; ls	goes to the given path and then lists the file names there
cd -	returns to the previous directory
cd ..	go to the parent directory
cd ../../../../..	go up n parent directories, one per ..
q	quit
clear	clear the terminal screen
↑ ↓	step back and forth through earlier commands, so repeated commands are easy to reuse
history	view the command history, which is kept even after shutting down the machine
echo	print text, like the PRINT operation in other languages
wc -l fish	count the number of lines in the file named fish
wc -c fish	count the number of bytes in the file named fish
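To see how the counting commands differ, here is a minimal sketch on a throwaway file. The file name fish and its contents are made up for illustration; grep -c is added only to contrast counting matching lines with what wc reports.

```shell
# Sketch: line, byte, and matching-line counts on a temporary file.
tmpdir=$(mktemp -d)
printf 'one fish\ntwo fish\nred herring\n' > "$tmpdir/fish"

# tr -d ' ' strips the padding that some wc implementations add.
line_count=$(wc -l < "$tmpdir/fish" | tr -d ' ')    # lines in the file
byte_count=$(wc -c < "$tmpdir/fish" | tr -d ' ')    # bytes in the file
match_count=$(grep -c fish "$tmpdir/fish")           # lines containing "fish"

echo "lines=$line_count bytes=$byte_count matches=$match_count"
rm -r "$tmpdir"
```

wc describes the whole file, while grep -c is the tool for counting lines that contain a word.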
File Operations
touch A	create a new empty file named A
ls -l A	view the details of file A
mv A B	rename A to B
mv '…/A.txt' Documents/Books	move the text file A.txt from some location to the Books folder under Documents
mv …/*.txt Documents/Books	move all text files from a location to the Books folder under Documents (the wildcard must be unquoted to expand)
rm B	delete file B permanently; it does not go to the trash
rm -i B	ask before deleting B (recommended)
cat A.txt	concatenate: prints the contents of one or more files one after another; here it prints A.txt
more A.txt	page through the contents of A.txt; type /pattern to search for text
less A.txt	page through A.txt: the up and down arrows scroll by line, space moves down a page, b moves up a page, q quits
source .bash_aliases	run the commands in the .bash_aliases file in the current shell
nano A.txt	open A.txt for editing; press Ctrl + O to save the changes and Ctrl + X to exit
find / -name "A"	search the whole file system for files named A
find / -name "A" 2>/dev/null	same search, but discard error messages (e.g. permission denied) so only valid results are shown
grep E	print the lines of standard input that contain E
grep E /A/B/C	search the file /A/B/C for lines matching the regular expression E
grep 'user$'	$ anchors the end of a line, so this matches all lines ending with user (quote the pattern so the shell leaves $ alone)
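The end-of-line anchor and the error redirection can be tried safely in a temporary directory. This is a small sketch; names.txt and its contents are invented for the example.

```shell
# Sketch with made-up file names and contents.
tmpdir=$(mktemp -d)
printf 'root user\nuser root\nsuperuser\n' > "$tmpdir/names.txt"

# Single quotes stop the shell from interpreting $, so grep receives
# the pattern user$ and keeps only lines that END with "user".
ends_with_user=$(grep 'user$' "$tmpdir/names.txt")

# find prints matches on stdout; 2>/dev/null throws away error messages.
found=$(find "$tmpdir" -name 'names.txt' 2>/dev/null)

printf '%s\n' "$ends_with_user"
rm -r "$tmpdir"
```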
folder operations
mkdir A	create new folder A
mkdir A/C	create subfolder C of folder A
mv A B	rename folder A to B
rmdir B	remove folder B only if it is empty; it is deleted permanently, without trash (not recommended)
rm -ir B	delete folder B and its contents recursively, asking before each deletion (recommended)
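The difference between rmdir and rm -r shows up as soon as a folder has something in it. A minimal sketch, run in a temporary directory with made-up names:

```shell
tmpdir=$(mktemp -d)

mkdir "$tmpdir/A"              # create folder A
mkdir "$tmpdir/A/C"            # create subfolder C inside A
mv "$tmpdir/A" "$tmpdir/B"     # rename A to B; C moves with it
touch "$tmpdir/B/C/note.txt"

# rmdir only removes empty folders, so it refuses here and B survives.
if rmdir "$tmpdir/B" 2>/dev/null; then rmdir_result=removed; else rmdir_result=refused; fi

# rm -r removes the folder and everything in it (add -i to confirm each item).
rm -r "$tmpdir/B"
if [ -e "$tmpdir/B" ]; then rm_result=still_there; else rm_result=gone; fi

echo "rmdir: $rmdir_result, rm -r: $rm_result"
rm -r "$tmpdir"
```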
network operations
curl 'http://xiaos.site'	curl = "see URL": downloads the source code of the web page (often fails when the site redirects)
curl -L 'http://xiaos.site'	follow redirects, then download the source code of the web page (recommended)
curl -o robertzhangxiao.html -L 'http://xiaos.site'	download the HTML from this site and save it as robertzhangxiao.html
curl -L 'http://xiaos.site' | grep fish	pipe the downloaded page into grep and look for lines containing fish
variables
numbers='XXX'	define a variable; there must be no spaces around the equals sign
echo $numbers	print the value of the variable
echo $LINES x $COLUMNS	print the terminal size (rows x columns)
echo $PATH	print the PATH environment variable: the directories the shell searches for programs
Shell Scripts	files ending in .sh
bin	conventional directory for binaries (executable programs)
ls bin	list the files in bin; assume it outputs magic
bin/magic	run the executable called magic
PATH=$PATH:/Users/student/bin	add the bin directory to PATH; afterwards typing just magic does the same
Note: not every .sh file that runs on Linux will also run on macOS or Windows.
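The PATH mechanism can be sketched without touching the real home directory. Here magic is a stand-in script created on the spot, and the temporary bin directory is an assumption made only for the demonstration.

```shell
# Sketch: "magic" is a made-up script, created here for illustration only.
tmpdir=$(mktemp -d)
mkdir "$tmpdir/bin"
printf '#!/bin/sh\necho "abracadabra"\n' > "$tmpdir/bin/magic"
chmod +x "$tmpdir/bin/magic"

by_path=$("$tmpdir/bin/magic")   # run it by giving its location

old_path=$PATH
PATH=$PATH:$tmpdir/bin           # append the directory to the search path
by_name=$(magic)                 # now the bare name is enough

PATH=$old_path                   # restore the original search path
rm -r "$tmpdir"
```

Appending to PATH this way lasts only for the current shell session; to keep it, the assignment is usually placed in a startup file such as .bashrc.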
console
PS1='$ '	set the prompt to a bare $, which removes the user and host name in front of it
alias ll='ls -la'	give a long command a shorter name; afterwards just type ll
alias	view all defined aliases

`cp -r S0252/S0252_mic/* ./S0150/S0150_mic/`

Copy all the data from the "S0252/S0252_mic/" directory to the "./S0150/S0150_mic/" directory. "-r" means recursive: subdirectories and their contents are copied too.
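What -r adds is easiest to see with a nested directory. A small sketch, with directory and file names invented for the example:

```shell
tmpdir=$(mktemp -d)
mkdir -p "$tmpdir/src/sub" "$tmpdir/dst"
echo "top"    > "$tmpdir/src/a.txt"
echo "nested" > "$tmpdir/src/sub/b.txt"

# With -r the sub directory and its contents are copied as well;
# without it, cp skips directories.
cp -r "$tmpdir"/src/* "$tmpdir/dst/"

# List what arrived, relative to the destination.
copied=$(cd "$tmpdir/dst" && find . -type f | sort)
printf '%s\n' "$copied"
rm -r "$tmpdir"
```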

`head ...txt`

Show the first few lines (10 by default) of the text file.

`wc -l ...txt`

Count the lines in the text file.

du -sh

Check the size of the directory.

`du -h --max-depth=1 .`

Check the size of each directory directly under the current directory.

`cat ...txt | tr '[:upper:]' '[:lower:]'`

Translate the upper case letters in that file into lower case.

`cat ...txt | tr '[:upper:]' '[:lower:]' | grep -o "[a-z]"`

Print the document letter by letter, one letter per line.

a
d
c
b

`cat ...txt | tr '[:upper:]' '[:lower:]' | grep -o "[a-z]" | sort`

Print the document letter by letter and sort the letters.

a
b
c
d

`cat ...txt | tr '[:upper:]' '[:lower:]' | grep -o "[a-z]" | sort | uniq -c`

Print how many times each letter occurs.

`cat ...txt | tr '[:upper:]' '[:lower:]' | grep -o "[a-z]" | sort | uniq -c | sort -nr`

Print how many times each letter occurs, ordered by frequency. In "-nr", "n" sorts numerically and "r" reverses the order, so the most frequent letter comes first.
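The counting pipeline can be tried on a tiny sample to check each stage. This is a sketch under assumptions: the sample text is invented, and the file is read with input redirection rather than cat, which is equivalent here.

```shell
tmpdir=$(mktemp -d)
printf 'Banana Bread\n' > "$tmpdir/sample.txt"

# lower-case -> one letter per line -> group equal letters -> count -> rank
counts=$(tr '[:upper:]' '[:lower:]' < "$tmpdir/sample.txt" \
  | grep -o '[a-z]' \
  | sort \
  | uniq -c \
  | sort -nr)

# uniq -c pads its counts with spaces; awk picks the fields out cleanly.
most_common=$(printf '%s\n' "$counts" | head -n 1 | awk '{print $2 ":" $1}')
distinct=$(printf '%s\n' "$counts" | wc -l | tr -d ' ')

echo "most common: $most_common, distinct letters: $distinct"
rm -r "$tmpdir"
```

The first sort matters: uniq -c only merges adjacent identical lines, so unsorted input would give several partial counts for the same letter.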

Using egrep to read a column:

Suppose there is a labelled .lab speech file.

Here the first column is the timing, the second is the frequency, and the third is the label.

0.1213 123 y
0.1232 111 uw
0.2113 110 eh
.............

To read everything in the third column, we use egrep:

`egrep -h -o "[a-z]{1,2}$" *.lab`

Here "[a-z]{1,2}" matches one or two lower case letters, "$" anchors the match to the end of the line, "-o" prints only the matched part, and "-h" suppresses the file names.

This will print:

y
uw
eh

`egrep -h -o "[a-z]{1,2}$" *.lab | sort | uniq -c | sort -nr`

This prints the frequency of each phone, most frequent first:

121 y
120 uw
110 eh
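A self-contained way to try the column extraction is sketched below. The sample .lab content is made up, grep -E is used as the equivalent of egrep, and the awk line is an alternative that is not in the original: it reads the third column directly instead of matching the end of the line.

```shell
tmpdir=$(mktemp -d)
# Made-up label file: time, frequency, phone label.
printf '0.1213 123 y\n0.1232 111 uw\n0.2113 110 eh\n0.2500 109 uw\n' > "$tmpdir/sample.lab"

# -E: extended regex (same as egrep), -h: no file names, -o: only the match.
phones=$(grep -E -h -o '[a-z]{1,2}$' "$tmpdir"/*.lab)

# Alternative: print the third whitespace-separated column with awk.
third_col=$(awk '{print $3}' "$tmpdir"/*.lab)

# Rank the phones by frequency and keep the most frequent one.
top_phone=$(printf '%s\n' "$phones" | sort | uniq -c | sort -nr | head -n 1 | awk '{print $2, $1}')
echo "top phone: $top_phone"
rm -r "$tmpdir"
```

The regular expression only captures the last one or two letters of a line, so for labels longer than two characters the awk form is the safer choice.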

`ls | wc -l`

Count how many files are in a directory.

`rm -rf ./*`

Delete everything in the current directory (hidden files excepted). No warning is shown, so use it with care. Note that rm refuses to delete "./" itself.

`cat ./.../*.txt`

Print the contents of all the .txt files in that directory.

`cat ./.../*.txt > ./text`

Write the contents of all the .txt files into the file text.

`python3 ./.../..py > ./text`

Write the output of the .py script into the file text.

file .wav:
Identify the file type of the wav file, which for audio usually includes format details.

Use mv to change the file name:

`mv ./../../.py ./../../.py`

We can use move (mv) to change a file's name.

`which ...`

Show where … is, i.e. the path of the executable that runs when you type …

ll -lh

Check the size of every file, in human-readable units.

If there is a space at the beginning of a file name, we just need to delete it:

`sed 's|^ ||'`

Adding a "_" in the middle of the file name, e.g. turning SPKID 09912 into SPKID_09912. The trailing g means globally: every match on the line is replaced, not just the first.

`sed 's| |_|g'`

`sed 's|SPKID|SPKID_|'`
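The three substitutions can be checked on sample strings fed through a pipe. The names below are invented for illustration; first_only is added to show what happens without the g flag.

```shell
# Made-up names for illustration.
trimmed=$(printf ' SPKID 09912\n' | sed 's|^ ||')           # drop one leading space
joined=$(printf 'SPKID 09912 a\n' | sed 's| |_|g')           # every space becomes _
first_only=$(printf 'SPKID 09912 a\n' | sed 's| |_|')        # without g: first space only
prefixed=$(printf 'SPKID09912\n' | sed 's|SPKID|SPKID_|')    # insert _ after SPKID

printf '%s\n' "$trimmed" "$joined" "$first_only" "$prefixed"
```

Using | as the delimiter instead of / is a free choice in sed; it is handy when the pattern itself contains slashes, as file paths do.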

Join two files line by line, separated by a space:

`paste -d ' ' wav.scp wav_id > tmp.txt`
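A quick sketch of what paste produces, using two tiny files whose contents are made up (real wav.scp and wav_id files will look different):

```shell
tmpdir=$(mktemp -d)
# Made-up contents for illustration.
printf '/data/a.wav\n/data/b.wav\n' > "$tmpdir/wav.scp"
printf 'id001\nid002\n' > "$tmpdir/wav_id"

# paste joins line N of each file; -d ' ' uses a space instead of a tab.
paste -d ' ' "$tmpdir/wav.scp" "$tmpdir/wav_id" > "$tmpdir/tmp.txt"

merged=$(cat "$tmpdir/tmp.txt")
printf '%s\n' "$merged"
rm -r "$tmpdir"
```

paste pairs lines purely by position, so both files need to be in the same order for the result to be meaningful.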

Remove the lines that contain particular words with grep -v:

`pip freeze | grep -v "@ the things you want to remove" > requirements.txt`
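The filtering step can be tried without pip installed by feeding grep some sample text. The package names, versions, and the "@ file://" pattern below are all made up for the example.

```shell
# Made-up stand-in for pip freeze output; the "@ file://" line
# represents an entry that should not go into requirements.txt.
frozen=$(printf 'numpy==1.26.4\nmypkg @ file:///tmp/mypkg\nrequests==2.31.0\n')

# -v inverts the match: keep every line that does NOT contain the pattern.
kept=$(printf '%s\n' "$frozen" | grep -v '@ file://')
printf '%s\n' "$kept"
```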

To prepare such files with a shell script, we can do:

mkdir -p data/voxceleb1_train

# get all the .wav file paths, e.g. /data/voxceleb1/dev/id1231/...wav
# (the pattern is quoted so that find, not the shell, expands it)
find /data/voxceleb1/dev -name "*.wav" > data/voxceleb1_train/temp.lst

# generate the wav.scp, e.g. id1231 data/voxceleb1/dev/id1231/...wav
# 1st: split each path on "/" into the array "a"
# 2nd: split a[8] (the file name) on "." and save the pieces into "b"
# the indices 6, 7, 8 depend on how deep the paths are
awk '{split($0, a, "/"); split(a[8], b, "."); print a[6]"-"a[7]"-"b[1], $1}' data/voxceleb1_train/temp.lst > data/voxceleb1_train/wav.scp

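# 1. Delete ".wav" from every line
# 2. If the first field is "1", print the second field; otherwise print the second and third fields
sed 's/\.wav//g' /data/the_text_we_need_to_handle.txt | awk '{if ($1 == "1") {print $2} else {print $2, $3}}' > processed.txt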
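The two awk idioms used above, split and if/else, can be tried on a single made-up line. The path below is hypothetical, and the sketch counts fields from the end (a[n], a[n-1], a[n-2]) instead of using fixed indices, which is a variation rather than the script's own approach.

```shell
# Hypothetical path; real corpus paths may be deeper or shallower.
path='/data/voxceleb1/dev/id1231/clip5/00001.wav'

# split on "/" gives a[1]="" (before the leading slash), a[2]="data", ...
utt_line=$(printf '%s\n' "$path" | awk '{
  n = split($0, a, "/")        # n is the number of pieces
  split(a[n], b, ".")          # b[1] is the file name without its extension
  print a[n-2] "-" a[n-1] "-" b[1], $1
}')

# == compares; a single = would assign "1" to $1 and always be true.
picked=$(printf '1 foo bar\n2 foo bar\n' \
  | awk '{ if ($1 == "1") { print $2 } else { print $2, $3 } }')

printf '%s\n' "$utt_line" "$picked"
```

Counting from the end keeps the command working when the corpus is moved to a directory at a different depth.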

Vim:

To the top:

gg

To the bottom:

G

vim name + Tab:

auto-completes the file name

sort the lines:

:sort

show line numbers (the last one is the line count):

`:set number`

delete one line:

dd

search for a keyword:

`/keyword`

check the difference between two files:

`vimdiff A.txt B.txt`

#Speech and Language Processing

Shell and Vim on Speech and Language Processing

http://xiaos.site/2022/07/11/Shell-and-Vim-on-Speech-and-Language-Processing/

Author

Xiao Zhang

Posted on

July 11, 2022
