AWK Syntax Essentials

Syntax is based on The AWK Programming Language, 2nd Edition by Alfred V. Aho, Brian W. Kernighan, and Peter J. Weinberger.
Pattern
A pattern determines when an action is executed. When a pattern matches an input line, its associated action is executed.
If no action is specified, the default is print $0.
Syntax: pattern { action }
Examples
- BEGIN: runs once before any input is read
BEGIN { FS=":" }
- END: runs once after all input is processed
END { print "total:", total }
- Expression: executes on every input line where the condition is true
Skip the header line:
NR > 1 { print $0 }
- Regex: executes on every line matching the pattern
/error/ { print $0 }
- Range: matches all lines from pattern1 through pattern2, inclusive
/start/,/end/ { print $0 }
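The pattern forms above can be combined in one program. A minimal sketch, wrapped in shell functions so it can be run as is; the function names (sum_second_column, between_markers) and the sample input are illustrative assumptions, not part of AWK:

```shell
# Hypothetical helper: sum column 2 of colon-separated input, skipping the header.
sum_second_column() {
  awk '
    BEGIN  { FS = ":" }                 # once, before any input is read
    NR > 1 { total += $2 }              # expression pattern: skip line 1
    END    { print "total:", total }    # once, after all input is processed
  '
}

# Hypothetical helper: print the lines from "start" through "end", inclusive.
between_markers() {
  awk '/start/,/end/ { print $0 }'
}

printf '%s\n' 'name:amount' 'a:3' 'b:4' | sum_second_column   # prints: total: 7
printf '%s\n' x start y end z | between_markers               # prints: start, y, end (one per line)
```

Each pattern-action pair is tested against every input line independently, so several pairs can fire for the same line.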
Conditionals
AWK supports C-style if, else if, and else for branching logic. They are statements, so they must appear inside an action (between braces).
Examples
- if
Skip empty lines:
if (NF > 0) print $0
- if-else
if ($1 > 0) print "positive"; else print "not positive"
- if-else if
if ($1 > 0) print "positive"
else if ($1 < 0) print "negative"
else print "zero"
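Because conditionals are statements, they need an enclosing action to run. A small sketch of the if / else if / else chain inside a complete program; classify_sign is a hypothetical name chosen for this example:

```shell
# Hypothetical helper: label the sign of the first field, skipping empty lines.
classify_sign() {
  awk '{
    if (NF == 0) next                       # skip empty lines
    if ($1 > 0)      print $1, "positive"
    else if ($1 < 0) print $1, "negative"
    else             print $1, "zero"
  }'
}

printf '%s\n' 5 -2 '' 0 | classify_sign
```

The sample call prints "5 positive", "-2 negative", and "0 zero"; the empty line produces no output.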
Ternary Expression
The ternary operator (?:) is a compact, C-style alternative to if-else. Unlike a statement, it is an expression: it yields a value that can be used directly in calculations, assignments, or print arguments.
Syntax: condition ? value_if_true : value_if_false
Examples
- Absolute Value
AWK has no built-in abs() function; a ternary expression fills the gap:
$1 = ($1 < 0) ? -$1 : $1
- Truthiness & Success Labels
AWK treats the number 0 and the empty string "" as false, and everything else as true. Use this to label a numeric flag field, where nonzero means success:
print $1, ($1 ? "SUCCESS" : "FAILURE")
or
printf("%s\t%s\n", $1, ($1 ? "SUCCESS" : "FAILURE"))
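The two ternary examples can be tried on sample input as follows. This is only a sketch; abs_first_field and label_flag are hypothetical wrapper names:

```shell
# Hypothetical helper: replace the first field with its absolute value.
abs_first_field() {
  awk '{ $1 = ($1 < 0) ? -$1 : $1; print }'
}

# Hypothetical helper: label a numeric flag; 0 and "" count as false.
label_flag() {
  awk '{ print $1, ($1 ? "SUCCESS" : "FAILURE") }'
}

printf '%s\n' '-3 x' '4 y' | abs_first_field   # prints: 3 x, then 4 y
printf '%s\n' 1 0 | label_flag                 # prints: 1 SUCCESS, then 0 FAILURE
```

Assigning to $1 makes AWK rebuild $0 with OFS between the fields, which is why the bare print shows the updated line.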
Associative Arrays
AWK arrays are associative — keys can be strings or numbers, making them ideal for counting, grouping, and lookups without any pre-declaration.
Syntax: array[key] = value
Examples
- Populate an array with lines
x[NR] = $0
- Deduplicate lines
!x[$0]++
- Populate a 2D matrix with all fields
for (i=1; i<=NF; i++) x[NR, i] = $i
AWK simulates 2D arrays by concatenating keys with a built-in separator (SUBSEP), so x[row, col] is stored internally as x[row SUBSEP col].
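A sketch of two common array uses: counting by key, and recovering the row and column from a combined key by splitting it on SUBSEP. The names count_first_field and dump_matrix are assumptions for illustration. The output is piped through sort because the order of for (k in array) is unspecified:

```shell
# Hypothetical helper: count how often each first field occurs.
count_first_field() {
  awk '
    { count[$1]++ }                                  # keys are created on first use
    END { for (k in count) print k, count[k] }       # traversal order is unspecified
  ' | sort
}

# Hypothetical helper: store every field under a (row, column) key, then
# split each key on SUBSEP to recover the two indices.
dump_matrix() {
  awk '
    { for (i = 1; i <= NF; i++) cell[NR, i] = $i }
    END {
      for (key in cell) {
        split(key, idx, SUBSEP)
        print idx[1], idx[2], cell[key]
      }
    }
  ' | sort
}

printf '%s\n' b a b | count_first_field      # prints: a 1, then b 2
printf '%s\n' 'a b' 'c d' | dump_matrix
```

For the second call, each output line has the form "row column value", for example "1 2 b".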
Loops
AWK supports standard C-style loops as well as a dedicated form for traversing associative arrays.
Examples
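- for loop
for (i=1; i<=NF; i++) print $i
- while loop (initialize the counter first)
i = 1; while (i <= NF) { print $i; i++ }
- do-while loop (the body runs at least once, even when NF is 0)
i = 1; do { print $i; i++ } while (i <= NF)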
- Iterate over an associative array
for (k in array) print k, array[k]
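A short sketch of the loop forms in complete programs. The first function walks the fields backwards with while; the second counts do-while iterations to show that the body runs once even on an empty line. Both function names are hypothetical:

```shell
# Hypothetical helper: print the fields of each line in reverse order.
reverse_fields() {
  awk '{
    i = NF                                   # initialize the counter explicitly
    while (i >= 1) {
      printf "%s%s", $i, (i > 1 ? " " : "\n")
      i--
    }
  }'
}

# Hypothetical helper: report NF next to the number of do-while iterations.
count_do_while() {
  awk '{
    n = 0; i = 1
    do { n++; i++ } while (i <= NF)          # body runs before the first test
    print NF, n
  }'
}

printf '%s\n' 'a b c' '1 2' | reverse_fields   # prints: c b a, then 2 1
printf '%s\n' 'a b c' '' | count_do_while      # prints: 3 3, then 0 1
```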
Pipes
AWK can send output to, or receive input from, external shell commands using pipes, enabling seamless integration with standard Unix tools.
Syntax: "command" | getline [var] (read from a command) or print [expr-list] | "command" (write to a command)
Examples
- Send output to a shell command
print $1 | "sort"
- Read a shell command's output into a variable
"date" | getline today
- Pipe to a multi-command pipeline (close it after the last write so the output is flushed)
print $1 | "sort | uniq -c"
close("sort | uniq -c")
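A sketch of both pipe directions with an explicit close(). The wrapper names first_line_of and sorted_then_footer are assumptions, and echo stands in for date so the result is predictable:

```shell
# Hypothetical helper: read one line of output from a command into a variable.
first_line_of() {
  awk -v cmd="$1" 'BEGIN {
    if ((cmd | getline line) > 0) print "got:", line
    close(cmd)                               # release the pipe
  }'
}

# Hypothetical helper: sort the first field, then print a footer after it.
sorted_then_footer() {
  awk '
    { print $1 | "sort" }
    END {
      close("sort")                          # wait for sort to write its output
      print "done"
    }
  '
}

first_line_of 'echo hello'                     # prints: got: hello
printf '%s\n' b a | sorted_then_footer         # prints: a, b, done (one per line)
```

The string passed to close() must match the command string used to open the pipe exactly; storing it in a variable, as cmd does here, avoids mismatches.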
Comparison Operators
| Operator | Meaning |
| < | less than |
| <= | less than or equal to |
| == | equal to |
| != | not equal to |
| >= | greater than or equal to |
| > | greater than |
| ~ | matched by |
| !~ | not matched by |
Logical Operators
| Operator | Meaning |
| && | AND |
| || | OR |
| ! | NOT |
Syntax: condition1 && condition2, condition1 || condition2, !condition
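Comparison, matching, and logical operators can be mixed freely in a pattern. A minimal sketch; the three-column input layout (name, score, tag) and the function names are assumptions made for this example:

```shell
# Hypothetical helper: print names from rows that are not commented out and
# either score at least 90 or carry the "vip" tag.
select_rows() {
  awk '$1 !~ /^#/ && ($2 >= 90 || $3 == "vip") { print $1 }'
}

# Hypothetical helper: print lines that do NOT contain "ok".
not_ok() {
  awk '!/ok/'
}

printf '%s\n' 'alice 95 std' '#bob 99 vip' 'carol 40 vip' 'dave 40 std' | select_rows
printf '%s\n' 'ok 1' 'fail 2' | not_ok
```

The first call prints alice and carol; the second prints "fail 2". The parentheses matter because && binds more tightly than the OR operator.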
Built-in Variables
| Variable | Description | Default |
| ARGC | Number of command-line arguments, including command name | - |
| ARGV | Array of command-line arguments, numbered 0..ARGC-1 | - |
| CONVFMT | Conversion format for numbers | "%.6g" |
| ENVIRON | Array of shell environment variables | - |
| FILENAME | Name of current input file | - |
| FNR | Record number in current file | - |
| FS | Input field separator | " " |
| NF | Number of fields in current record | - |
| NR | Number of records read so far | - |
| OFMT | Output format for numbers | "%.6g" |
| OFS | Output field separator for print | " " |
| ORS | Output record separator for print | "\n" |
| RLENGTH | Length of string matched by match function | - |
| RS | Input record separator | "\n" |
| RSTART | Start of string matched by match function | - |
| SUBSEP | Subscript separator | "\034" |
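A small sketch showing a few of these variables together: FS and OFS are set in BEGIN, while NR, NF, and $NF change per record. describe_records is a hypothetical name:

```shell
# Hypothetical helper: show record number, field count, first and last field.
describe_records() {
  awk '
    BEGIN { FS = ":"; OFS = "," }            # split input on ":", join output with ","
    { print NR, NF, $1, $NF }
  '
}

printf '%s\n' 'root:x:0' 'bin:x:1:extra' | describe_records
```

The sample call prints "1,3,root,0" and "2,4,bin,extra".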
Built-in Arithmetic Functions
| Function | Value Returned |
| atan2(y, x) | arctangent of y/x in the range −π to π |
| cos(x) | cosine of x, with x in radians |
| exp(x) | exponential function of x, e^x |
| int(x) | integer part of x; truncated towards 0 |
| log(x) | natural (base e) logarithm of x |
| rand() | random number r, where 0 ≤ r < 1 |
| sin(x) | sine of x, with x in radians |
| sqrt(x) | square root of x |
| srand(x) | x is new seed for rand(); use time of day if x is omitted; return previous seed |
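A brief sketch of a few arithmetic functions. The rounding idiom int(x + 0.5) is a common workaround rather than a built-in, and it is only valid for non-negative x because int() truncates toward 0. The function names are hypothetical:

```shell
# Hypothetical helper: round non-negative numbers to the nearest integer.
round_half_up() {
  awk '{ print int($1 + 0.5) }'
}

# Hypothetical helper: print a few fixed results for reference.
math_demo() {
  awk 'BEGIN {
    print int(3.7), int(-3.7)                # truncation toward 0: 3 -3
    printf "%.4f\n", atan2(0, -1)            # one way to obtain pi: 3.1416
    print sqrt(16), exp(0), log(1)           # 4 1 0
  }'
}

printf '%s\n' 2.5 2.4 0 | round_half_up      # prints: 3, 2, 0 (one per line)
math_demo
```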
Built-in String Functions
| Function | Description |
| gsub(r,s) | substitute s for r globally in $0, return number of substitutions made |
| gsub(r,s,t) | substitute s for r globally in string t, return number of substitutions made |
| index(s,t) | return first position of string t in s, or 0 if t is not present |
| length(s) | return number of Unicode characters in s; return number of elements if s is an array |
| match(s,r) | test whether s contains a substring matched by r; return index or 0; sets RSTART and RLENGTH |
| split(s,a) | split s into array a on FS or as CSV if --csv is set, return number of elements in a |
| split(s,a,fs) | split s into array a on field separator fs, return number of elements in a |
| sprintf(fmt,expr-list) | return expr-list formatted according to format string fmt |
| sub(r,s) | substitute s for the leftmost longest substring of $0 matched by r; return number of substitutions made |
| sub(r,s,t) | substitute s for the leftmost longest substring of t matched by r; return number of substitutions made |
| substr(s,p) | return suffix of s starting at position p |
| substr(s,p,n) | return substring of s of length at most n starting at position p |
| tolower(s) | return s with upper case ASCII letters mapped to lower case |
| toupper(s) | return s with lower case ASCII letters mapped to upper case |
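A sketch of how several string functions combine in practice: match() sets RSTART and RLENGTH for substr(), gsub() returns the number of replacements, and split() returns the element count. The function names and sample lines are assumptions for illustration:

```shell
# Hypothetical helper: print the first hh:mm-like token of each line that has one.
extract_time() {
  awk 'match($0, /[0-9]+:[0-9]+/) { print substr($0, RSTART, RLENGTH) }'
}

# Hypothetical helper: mask digits in $0 and report how many were replaced.
mask_digits() {
  awk '{ n = gsub(/[0-9]/, "#"); print n, $0 }'
}

# Hypothetical helper: split a comma-separated line and upper-case item 2.
second_item_upper() {
  awk '{ n = split($0, parts, ","); print n, toupper(parts[2]) }'
}

printf '%s\n' 'error at 10:42 today' 'no time here' | extract_time   # prints: 10:42
printf '%s\n' 'id 42 x7' | mask_digits                               # prints: 3 id ## x#
printf '%s\n' 'a,b,c' | second_item_upper                            # prints: 3 B
```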
Expression Operators
| Operation | Operators | Example | Meaning of Example |
| assignment | = += -= *= /= %= ^= | x *= 2 | x = x * 2 |
| conditional | ?: | x ? y : z | if x is true then y else z |
| logical OR | || | x || y | 1 if x or y is true, 0 otherwise |
| logical AND | && | x && y | 1 if x and y are true, 0 otherwise |
| array membership | in | i in a | 1 if a[i] exists, 0 otherwise |
| matching | ~ !~ | $1 ~ /x/ | 1 if the first field contains an x, 0 otherwise |
| relational | < <= == != >= > | x == y | 1 if x is equal to y, 0 otherwise |
| concatenation | (none) | "a" "bc" | "abc"; there is no explicit concatenation operator |
| add, subtract | + - | x + y | sum of x and y |
| multiply, divide, mod | * / % | x % y | remainder of x divided by y |
| unary plus and minus | + - | -x | negated value of x |
| logical NOT | ! | !$1 | 1 if $1 is zero or null, 0 otherwise |
| exponentiation | ^ | x ^ y | x to the power y |
| increment, decrement | ++ -- | ++x, x++ | add 1 to x |
| field | $ | $i + 1 | value of i-th field, plus 1 |
| grouping | () | $(i++) | return i-th field, then increment i |
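The table lists operators from lowest to highest precedence. A few cases where that ordering is easy to misread, as a sketch; precedence_demo is a hypothetical name:

```shell
# Hypothetical helper: print results that depend on operator precedence.
precedence_demo() {
  awk 'BEGIN {
    print 1 " " 2 + 3            # + binds tighter than concatenation: 1 5
    print 2 ^ 3 ^ 2              # ^ groups right to left, 2 ^ (3 ^ 2): 512
    print -2 ^ 2                 # ^ binds tighter than unary minus: -4
    a["k"] = 1
    has_k = ("k" in a)           # membership test, 1 because a["k"] exists
    has_z = ("z" in a)           # 0, and the test does not create a["z"]
    print has_k, has_z
  }'
}

precedence_demo
```

When in doubt, parentheses make the intended grouping explicit and cost nothing.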
printf
printf format-control characters
| Character | Print Expression As |
| c | single UTF-8 character (code point) |
| d or i | decimal integer |
| e or E | [-]d.dddddde[+-]dd or [-]d.ddddddE[+-]dd |
| f | [-]ddd.dddddd |
| g or G | e or f conversion, whichever is shorter, with nonsignificant zeros suppressed |
| o | unsigned octal number |
| u | unsigned integer |
| s | string |
| x or X | unsigned hexadecimal number |
| % | print a %; no argument is consumed |
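A sketch of the conversions with width and precision modifiers, which follow the C printf conventions (a leading - left-justifies, and .2 sets the number of decimals). The function names and the three-column input are assumptions for this example:

```shell
# Hypothetical helper: format "name value percent" rows into aligned columns.
format_row() {
  awk '{ printf "%-6s|%6.2f|%5.1f%%\n", $1, $2, $3 }'
}

# Hypothetical helper: show several conversions of fixed values.
conversion_demo() {
  awk 'BEGIN { printf "%d %x %o %c %e\n", 255, 255, 8, "A", 1234.5 }'
}

printf '%s\n' 'cpu 3.14159 42' | format_row    # prints: cpu   |  3.14| 42.0%
conversion_demo                                # prints: 255 ff 10 A 1.234500e+03
```

Unlike print, printf adds no newline of its own, so the format string must end with \n.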






