awk command in Linux
Introduction
awk is a pattern-directed scanning and processing language. It is useful for quickly filtering the output of other commands.
This article shows the basic usage of awk and how it can be integrated into a daily workflow.
Grabbing column(s) from text
In this example we’ll grab column(s) from the output of the ps command.
ps command
The ps output has 4 columns.
❯ ps
PID TTY TIME CMD
608 ttys000 0:00.46 -zsh
613 ttys000 0:00.01 -zsh
797 ttys000 0:00.00 -zsh
798 ttys000 0:01.11 -zsh
Print one column
Let’s say you want to print only the PIDs from the ps output.
$1 below refers to the first column (yes, numbering starts at 1). $0 refers to the entire line, so printing it prints all columns.
❯ ps | awk '{print $1}'
PID
608
613
797
798
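Note that the header line (PID) is printed along with the data. A minimal sketch for skipping it, assuming the header is always the first line. The sample text is copied from the ps output above rather than read from a live ps, so the result is reproducible:

```shell
# Sample text copied from the ps output above (not a live process list).
sample='PID TTY TIME CMD
608 ttys000 0:00.46 -zsh
613 ttys000 0:00.01 -zsh'

# NR is the number of the current line; NR > 1 skips the header line.
pids=$(printf '%s\n' "$sample" | awk 'NR > 1 {print $1}')
printf '%s\n' "$pids"
```

With a live process list, the same condition would be used as ps | awk 'NR > 1 {print $1}'.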
Print multiple columns
We usually want to see the PID and the corresponding command side by side.
You can print multiple columns by passing more arguments to print, separated by commas.
❯ ps | awk '{print $1, $4}'
PID CMD
608 -zsh
613 -zsh
797 -zsh
798 -zsh
Print last column
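Nobody wants to count all the columns just to print the last one. Luckily, awk has a built-in variable for that: NF holds the number of fields in the current line, so $NF is the last field.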
❯ ps | awk '{print $NF}'
CMD
-zsh
-zsh
-zsh
-zsh
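Because NF is an ordinary number, it can also be used in expressions. The sketch below, run on one line copied from the ps output above, prints the field count and the second-to-last field. One caveat: when a command in a real ps listing contains spaces (for example, arguments), $NF yields only its last word.

```shell
# One line copied from the ps output above.
line='608 ttys000 0:00.46 -zsh'

# NF is the number of fields on the line.
field_count=$(printf '%s\n' "$line" | awk '{print NF}')

# $(NF-1) is the second-to-last field.
second_last=$(printf '%s\n' "$line" | awk '{print $(NF-1)}')

printf '%s %s\n' "$field_count" "$second_last"
```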
Field separator
By default awk uses whitespace as the field separator. It can be changed using the -F option.
A file separated by :
/etc/passwd has columns separated by :
❯ cat /etc/passwd
root:x:0:0:root:/root:/usr/bin/zsh
daemon:x:1:1:daemon:/usr/sbin:/usr/sbin/nologin
bin:x:2:2:bin:/bin:/usr/sbin/nologin
sys:x:3:3:sys:/dev:/usr/sbin/nologin
sync:x:4:65534:sync:/bin:/bin/sync
games:x:5:60:games:/usr/games:/usr/sbin/nologin
man:x:6:12:man:/var/cache/man:/usr/sbin/nologin
lp:x:7:7:lp:/var/spool/lpd:/usr/sbin/nologin
mail:x:8:8:mail:/var/mail:/usr/sbin/nologin
news:x:9:9:news:/var/spool/news:/usr/sbin/nologin
...
Get all users on system
From the previous output we can see that the first column contains the username. So, print the first column and use : as the separator.
❯ awk -F ":" '{print $1}' /etc/passwd
root
daemon
bin
sys
sync
games
man
lp
mail
news
...
Add custom separator before printing
Let’s print each user and its default shell, but with a more readable separator in the output. A single space is hard to read, so we’ll set the output field separator (OFS) to a tab (\t).
❯ awk 'BEGIN{FS=":"; OFS="\t"} {print $1, $NF}' /etc/passwd
root /usr/bin/zsh
daemon /usr/sbin/nologin
bin /usr/sbin/nologin
sys /usr/sbin/nologin
sync /bin/sync
games /usr/sbin/nologin
man /usr/sbin/nologin
lp /usr/sbin/nologin
mail /usr/sbin/nologin
news /usr/sbin/nologin
...
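The same separators can also be set from the command line with -F and -v instead of a BEGIN block. The sketch below uses one line copied from the /etc/passwd listing above and also shows that OFS is only inserted where print arguments are separated by a comma; without the comma the values are simply concatenated.

```shell
# One line copied from the /etc/passwd listing above.
line='root:x:0:0:root:/root:/usr/bin/zsh'

# -F sets the input separator, -v OFS sets the output separator.
with_comma=$(printf '%s\n' "$line" | awk -F: -v OFS='\t' '{print $1, $NF}')

# No comma between the values: they are joined directly and OFS is not used.
no_comma=$(printf '%s\n' "$line" | awk -F: -v OFS='\t' '{print $1 $NF}')

printf '%s\n%s\n' "$with_comma" "$no_comma"
```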
Pattern matching
A pattern goes between two slashes (like /<pattern>/) and is matched against the whole line.
Print all usernames that start with the letter d, along with their respective shells.
❯ awk -F ":" '/^d/ {print $1, $NF}' /etc/passwd
daemon /usr/sbin/nologin
dnsmasq /usr/sbin/nologin
dhananjay /usr/bin/zsh
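A pattern can also be tested against a single field with the ~ operator, or negated with !~. A small sketch, using three lines copied from the /etc/passwd listing above:

```shell
# Three lines copied from the /etc/passwd listing above.
sample='root:x:0:0:root:/root:/usr/bin/zsh
daemon:x:1:1:daemon:/usr/sbin:/usr/sbin/nologin
sync:x:4:65534:sync:/bin:/bin/sync'

# $NF ~ /regex/ tests only the last field instead of the whole line.
zsh_users=$(printf '%s\n' "$sample" | awk -F: '$NF ~ /zsh$/ {print $1}')

# !~ negates the match: users whose shell does not end in nologin.
login_users=$(printf '%s\n' "$sample" | awk -F: '$NF !~ /nologin$/ {print $1}')

printf '%s\n---\n%s\n' "$zsh_users" "$login_users"
```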
Some more goodies
BEGIN and END rules
A BEGIN rule is executed once before any text processing starts. In fact, it’s executed before awk even reads any text.
An END rule is executed after all processing has completed. You can have multiple BEGIN and END rules, and they’ll execute in order.
❯ awk -F ":" 'BEGIN {print "User shells\n============"} {print $1 "\t" $NF}' /etc/passwd | head
User shells
============
root /usr/bin/zsh
daemon /usr/sbin/nologin
bin /usr/sbin/nologin
sys /usr/sbin/nologin
sync /bin/sync
games /usr/sbin/nologin
man /usr/sbin/nologin
lp /usr/sbin/nologin
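The example above only uses BEGIN. A minimal sketch that adds an END rule to print a summary after the last line has been read, using three lines copied from the /etc/passwd listing above (the wording of the summary line is made up for illustration):

```shell
# Three lines copied from the /etc/passwd listing above.
sample='root:x:0:0:root:/root:/usr/bin/zsh
daemon:x:1:1:daemon:/usr/sbin:/usr/sbin/nologin
sync:x:4:65534:sync:/bin:/bin/sync'

summary=$(printf '%s\n' "$sample" | awk -F: '
  BEGIN { n = 0; print "User shells" }   # runs once, before any input is read
  $NF ~ /nologin$/ { n++ }               # runs for every matching line
  END { print n " of " NR " users cannot log in" }  # runs once, after the last line
')

printf '%s\n' "$summary"
```

In the END rule, NR still holds the number of the last line read, which makes it a convenient line count.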
Arithmetic
❯ df -h | awk '{print $2 "\t" $3+$4}'
Size 0
3.9G 3.9
793M 793.5
689G 661
3.9G 3.9
5.0M 5
3.9G 3.9
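One caveat with the output above: when awk does arithmetic on a string such as 132G, it uses only the leading number and ignores the unit suffix, so adding values with different units (M and G, for example) gives a meaningless sum. A sketch of a safer variant, assuming sizes are given as plain numbers in a single unit, as df -k would print them. The sample data is made up for illustration:

```shell
# Made-up sample in the layout of `df -k` (sizes in 1K blocks).
sample='Filesystem 1K-blocks Used Available
/dev/sda1 1000 250 750
/dev/sdb1 2000 500 1500'

# Skip the header, add Used + Available per line, and keep a running total.
totals=$(printf '%s\n' "$sample" |
  awk 'NR > 1 {sum += $3 + $4; print $1, $3 + $4} END {print "total", sum}')

# A string such as "132G" is converted using its leading number only.
prefix=$(awk 'BEGIN {print "132G" + "529G"}')

printf '%s\n%s\n' "$totals" "$prefix"
```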
Length of string
❯ df -h | awk 'length($1) > 5'
Filesystem Size Used Avail Use% Mounted on
/dev/sda1 689G 132G 529G 20% /
/dev/loop3 56M 56M 0 100% /snap/core18/2066
/dev/loop5 163M 163M 0 100% /snap/gnome-3-28-1804/145
/dev/loop7 219M 219M 0 100% /snap/gnome-3-34-1804/66
...
if condition
Print all PIDs owned by your user.
❯ ps aux | awk '{ if($1 == "dhananj+") print $2}' | head
2283
2292
...
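The same filter can be written without if, by placing the condition before the braces, and the user name can be passed in with -v instead of being hard-coded. A sketch under stated assumptions: the sample mimics only the first two columns of ps aux, and alice and bob are hypothetical user names.

```shell
# Made-up sample in the layout of `ps aux` (first two columns only).
sample='USER PID
alice 101
bob 202
alice 303'

# -v passes the hypothetical user name in as the awk variable u.
# A bare condition before the braces does the same job as the if statement.
pids=$(printf '%s\n' "$sample" | awk -v u="alice" '$1 == u {print $2}')

printf '%s\n' "$pids"
```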
Some built-in functions
- Square root
❯ awk 'BEGIN { print sqrt(25)}'
- Arctangent
❯ awk 'BEGIN {print atan2(0, -1)}'
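For reference, the sketch below captures the results of the two commands above and adds two string functions from standard awk, toupper and substr, as further examples. The input strings are chosen only for illustration.

```shell
# Numeric functions from the examples above.
root=$(awk 'BEGIN {print sqrt(25)}')        # square root of 25
pi=$(awk 'BEGIN {print atan2(0, -1)}')      # atan2(0, -1) is pi, printed with 6 significant digits

# String functions: upper-casing and substring extraction (positions start at 1).
upper=$(awk 'BEGIN {print toupper("awk")}')
part=$(awk 'BEGIN {print substr("/usr/bin/zsh", 10)}')

printf '%s %s %s %s\n' "$root" "$pi" "$upper" "$part"
```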