Before learning awk, you should already know sed, grep, tr, cut, and similar commands. They all make text and data processing on Linux easier, but a single one of them often cannot do the whole job, so we end up chaining several together with pipes. This article introduces awk, which covers most text and data processing needs and often solves with one command what would otherwise take a pipeline.

1. Introduction to awk command

Awk is known as one of the three musketeers of text processing. Its name comes from the first letters of the surnames of its creators: Alfred Aho, Peter Weinberger, and Brian Kernighan. AWK is in fact a language of its own, the AWK Programming Language, which its three creators formally defined as a "pattern scanning and processing language." It lets you write short programs that read input files, sort data, process data, perform calculations on the input, and generate reports, among many other things.

2. awk command format and options

Syntax

awk [options] 'script' var=value file(s)

Common command options

-F fs    fs specifies the input field separator; it can be a string or a regular expression, such as -F:

3. The principle of awk

Step 1: Execute the statements in the BEGIN{ commands } statement block;
Step 2: Read the input one line at a time and execute the pattern{ commands } statement block for each line, until all input has been read;
Step 3: When the end of the input stream is reached, execute the END{ commands } statement block.

The BEGIN block is executed before awk reads any line from the input stream. It is an optional block.

The END block is executed after awk has read all the lines from the input stream. Summaries, such as printing the analysis results for all lines, are produced in the END block. It is also an optional block.

The commands in the pattern block are the most important part, and they are also optional. If no pattern block is given, { print } is executed by default, that is, every line read is printed. The pattern block is executed once for each line awk reads.

4. Basic usage of awk

There are three ways to call awk:

1. Command line method

awk [-F field-separator] 'commands' input-file(s)

Here commands are the actual awk commands, and [-F field-separator] is optional. input-file(s) are the files to be processed.

2. Shell script method

An awk script usually consists of three parts: a BEGIN statement block, a general statement block that can use pattern matching, and an END statement block. All three parts are optional; any of them may be omitted. The script is usually enclosed in single or double quotes, for example:

awk 'BEGIN{ i=0 } { i++ } END{ print i }' filename
awk "BEGIN{ i=0 } { i++ } END{ print i }" filename

Single quotes are the safer choice: inside double quotes the shell expands $0, $1, and similar before awk sees them.

3. Insert all awk commands into a separate file and then call

awk -f awk-script-file input-file(s)

The -f option loads the awk script from awk-script-file; input-file(s) is the same as in the command line method above.

Print every line of /etc/passwd:

[root@localhost ~]# awk '{print $0}' /etc/passwd
root:x:0:0:root:/root:/bin/bash
bin:x:1:1:bin:/bin:/sbin/nologin
daemon:x:2:2:daemon:/sbin:/sbin/nologin
adm:x:3:4:adm:/var/adm:/sbin/nologin
lp:x:4:7:lp:/var/spool/lpd:/sbin/nologin
sync:x:5:0:sync:/sbin:/bin/sync
shutdown:x:6:0:shutdown:/sbin:/sbin/shutdown
halt:x:7:0:halt:/sbin:/sbin/halt
......

The action runs once per input line, whether or not it uses the line's content:

[root@localhost ~]# echo 123|awk '{print "hello,awk"}'
hello,awk
[root@localhost ~]# awk '{print "hi"}' /etc/passwd
hi
hi
hi
hi
hi
hi
hi
hi
hi
......
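The BEGIN, per-line, and END blocks described above, and the difference between passing the program on the command line and loading it with -f, can be sketched as follows. This is only a minimal illustration: the sample input and the shell variable names (sample, result, script_file, line_count) are made up, and mktemp is assumed to be available.

```shell
# Sample input (made up for illustration): three short lines.
sample='alpha
beta
gamma'

# BEGIN runs once before any input is read, the middle block runs once
# per line, and END runs once after the last line has been read.
result=$(printf '%s\n' "$sample" | awk '
BEGIN { print "start"; count = 0 }
      { count++; print NR ": " $0 }
END   { print "lines: " count }')
printf '%s\n' "$result"

# The same counting logic stored in a file and loaded with -f.
script_file=$(mktemp)
cat > "$script_file" <<'EOF'
BEGIN { count = 0 }
      { count++ }
END   { print count }
EOF
line_count=$(printf '%s\n' "$sample" | awk -f "$script_file")
rm -f "$script_file"
printf '%s\n' "$line_count"
```

Keeping the program in a file tends to pay off once it grows beyond a line or two, since it avoids shell quoting issues entirely.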
The awk workflow is as follows: read one record, delimited by the newline character '\n'; split the record into fields according to the field separator; and fill in the field variables. $0 represents the whole record, $1 the first field, and $n the nth field. The default field separator is whitespace (spaces or tabs), so for a whitespace-separated login record, $1 would be the logged-in user, $3 the user's IP address, and so on.

For example, print all usernames in /etc/passwd:

[root@localhost ~]# awk -F: '{print $1}' /etc/passwd
root
bin
daemon
adm
......

Print the username and UID:

[root@localhost ~]# awk -F: '{print $1,$3}' /etc/passwd
root 0
bin 1
daemon 2
......

Print the username and UID with labels:

[root@localhost ~]# awk -F: '{print "username: " $1 "\t\tuid: "$3}' /etc/passwd
username: root          uid: 0
username: bin           uid: 1
username: daemon        uid: 2
......

5. awk built-in variables
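A minimal sketch of a few commonly used built-in variables, run on made-up colon-separated records. The shell variable names records and vars_out are only for illustration; FS, OFS, NR, and NF are standard awk variables.

```shell
# Sample records (made up for illustration), colon-separated like /etc/passwd.
records='root:x:0:0
bin:x:1:1'

# FS  - input field separator (set here in BEGIN instead of with -F)
# OFS - output field separator, placed between comma-separated print arguments
# NR  - number of records read so far
# NF  - number of fields in the current record
vars_out=$(printf '%s\n' "$records" | awk '
BEGIN { FS = ":"; OFS = " | " }
      { print NR, NF, $1, $NF }')
printf '%s\n' "$vars_out"
```

Setting FS in BEGIN has the same effect as -F: on the command line; OFS only takes effect where print arguments are separated by commas.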
Example:

[root@localhost ~]# echo -e "line1 f2 f3\nline2 f4 f5\nline3 f6 f7" | awk '{print "Line No:"NR", No of fields:"NF, "$0="$0, "$1="$1, "$2="$2, "$3="$3}'
Line No:1, No of fields:3 $0=line1 f2 f3 $1=line1 $2=f2 $3=f3
Line No:2, No of fields:3 $0=line2 f4 f5 $1=line2 $2=f4 $3=f5
Line No:3, No of fields:3 $0=line3 f6 f7 $1=line3 $2=f6 $3=f7

Use print $NF to print the last field in a line, $(NF-1) to print the second-to-last field, and so on:

[root@localhost ~]# echo -e "line1 f2 f3\n line2 f4 f5" | awk '{print $NF}'
f3
f5
[root@localhost ~]# echo -e "line1 f2 f3\n line2 f4 f5" | awk '{print $(NF-1)}'
f2
f4

Print, for each line of /etc/passwd, the file name, the line number, the number of columns, and the complete line content:

[root@localhost ~]# awk -F ':' '{print "filename:" FILENAME ",linenumber:" NR ",columns:" NF ",linecontent:"$0}' /etc/passwd
filename:/etc/passwd,linenumber:1,columns:7,linecontent:root:x:0:0:root:/root:/bin/bash
filename:/etc/passwd,linenumber:2,columns:7,linecontent:bin:x:1:1:bin:/bin:/sbin/nologin
filename:/etc/passwd,linenumber:3,columns:7,linecontent:daemon:x:2:2:daemon:/sbin:/sbin/nologin

Print, for each line of /etc/passwd, the number of command-line arguments ARGC, the line number within the current file FNR, the field separator FS, the number of fields in the record NF, and the number of records read so far NR (by default, the line number):

[root@localhost ~]# awk -F: 'BEGIN{printf "%4s %4s %4s %4s %4s %4s\n","FILENAME","ARGC","FNR","FS","NF","NR";printf "---------------------------------------------\n"} {printf "%4s %4s %4s %4s %4s %4s\n",FILENAME,ARGC,FNR,FS,NF,NR}' /etc/passwd
FILENAME ARGC  FNR   FS   NF   NR
---------------------------------------------
/etc/passwd    2    1    :    7    1
/etc/passwd    2    2    :    7    2
/etc/passwd    2    3    :    7    3

6. Advanced usage of awk

1. awk assignment operation

Assignment statement operators: = += -= *= /= %= ^= **=

For example, a+=5 is equivalent to a=a+5:

[root@localhost ~]# awk 'BEGIN{a=5;a+=5;print a}'
10

2. awk regular expression matching

Output the lines that contain root, printing the username, the UID, and the original line:

[root@localhost ~]# awk -F: '/root/ {print $1,$3,$0}' /etc/passwd
root 0 root:x:0:0:root:/root:/bin/bash
operator 11 operator:x:11:0:operator:/root:/sbin/nologin

Two lines match. To match only the lines that start with root, anchor the pattern with ^:

awk -F: '/^root/' /etc/passwd

3. awk ternary operation

[root@localhost ~]# awk 'BEGIN{a="b";print a=="b"?"ok":"err"}'
ok
[root@localhost ~]# awk 'BEGIN{a="b";print a=="c"?"ok":"err"}'
err

The ternary operation is a conditional expression: if the condition is true, the result is the value after ?; if it is false, the result is the value after :.

4. Control flow and loops in awk

Use of if statement

[root@localhost ~]# awk 'BEGIN{ test=100;if(test>90){ print "very good";} else{print "no pass";}}'
very good

Each statement ends with ;

Use of while loop

[root@localhost ~]# awk 'BEGIN{test=100;num=0;i=0;while(i<=test){num+=i; i++;}print num;}'
5050

Use of for loop

[root@localhost ~]# awk 'BEGIN{test=0;for(i=0;i<=100;i++){test+=i;}print test;}'
5050

Use of do loop

[root@localhost ~]# awk 'BEGIN{test=0;i=0;do{test+=i;i++}while(i<=100);print test;}'
5050

5. Array application of awk

Arrays are the soul of awk; much of its power in text processing comes from them. Because array indices (subscripts) can be numbers or strings, arrays in awk are called associative arrays. They do not need to be declared in advance, nor do they need a size. Array elements are initialized to 0 or the empty string, depending on the context. Generally speaking, arrays in awk are used to collect information from records, for example to calculate sums, count words, or track the number of times a pattern is matched.
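As a minimal sketch of an associative array with string subscripts, the following counts how many accounts use each login shell. The sample data and the names accounts, shells, and shell_counts are made up for illustration; the output is piped through sort because the order of a for (key in array) loop is not specified.

```shell
# Sample data (made up): the login shell is the last colon-separated field.
accounts='root:x:0:0:root:/root:/bin/bash
bin:x:1:1:bin:/bin:/sbin/nologin
daemon:x:2:2:daemon:/sbin:/sbin/nologin'

# shells[$NF]++ creates the element on first use (starting from 0)
# and increments it; the array index is the shell path, a string.
shell_counts=$(printf '%s\n' "$accounts" | awk -F: '
{ shells[$NF]++ }
END { for (s in shells) print shells[s], s }' | sort)
printf '%s\n' "$shell_counts"
```

No declaration or size is needed: the first reference to shells["/bin/bash"] brings the element into existence.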
For example, store each username in an array indexed by record number, then print each index and name:

[root@localhost ~]# awk -F: 'BEGIN {count=0;} {name[count] = $1;count++;}; END{for (i = 0; i < NR; i++) print i, name[i]}' /etc/passwd
0 root
1 bin
2 daemon
3 adm
4 lp
5 sync
......

6. Application of awk string functions

Function name and description

sub( regular expression, replacement string [, target string] )

Examples:

awk '{ sub(/test/, "mytest"); print }' testfile
awk '{ sub(/test/, "mytest", $1); print }' testfile

The first example matches against the whole record and replaces only the first occurrence. To replace every occurrence in the record, use gsub. The second example matches against the first field of each record and again replaces only the first occurrence.

gsub( regular expression, replacement string [, target string] )

Examples:

awk '{ gsub(/test/, "mytest"); print }' testfile
awk '{ gsub(/test/, "mytest", $1); print }' testfile

The first example replaces every occurrence of test in every record of the file with mytest. The second example replaces every occurrence of test in the first field of every record with mytest.

index( string, substring )

Returns the position at which substring first occurs in string. For instance, the position of test in mytest is 3.

substr( string, start position [, length] )

Returns part of a string. For instance, it can be used to extract the substring world from a longer string.

split( string, array [, field separator] )

Splits a string into array elements. For instance, a time string can be split on colons into an array named time; for a time whose minutes are 18, the second element time[2] is 18.

length( [string] )

Examples:

awk 'BEGIN{ print length("test") }'
awk '{ print length }' testfile

The first example returns the length of the string test, which is 4. The second example prints the number of characters in each record of testfile.

match( string, regular expression )

Examples:

awk 'BEGIN{start=match("this is a test",/[a-z]+$/); print start}'
awk 'BEGIN{start=match("this is a test",/[a-z]+$/); print start, RSTART, RLENGTH }'

The first example prints the starting position of the run of lowercase letters at the end of the string, which is 11 in this case. The second example also prints the RSTART and RLENGTH variables, giving 11 (start), 11 (RSTART), 4 (RLENGTH).

toupper( string )

Returns string with all lowercase letters converted to uppercase.
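As a minimal sketch of index, substr, split, and toupper described above, the following runs each of them in a BEGIN block so that no input file is needed. All input strings and the shell variable name string_out are made up for illustration.

```shell
# All input strings below are made up for illustration.
string_out=$(awk 'BEGIN {
    print index("mytest", "test")        # position of "test" in "mytest"
    print substr("hello world", 7, 5)    # 5 characters starting at position 7
    n = split("12:18:30", time, ":")     # n is the number of pieces
    print n, time[2]                     # piece count and the second piece
    print toupper("awk")                 # uppercase copy of the string
}')
printf '%s\n' "$string_out"
```

Positions in awk strings start at 1, which is why "test" is found at position 3 of "mytest" and "world" starts at position 7 of "hello world".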