awk Quickstart Guide

Introduction

awk is a text-processing tool that reads input line by line, tests each line against patterns, and runs actions on the lines that match. It's particularly useful for manipulating structured, column-oriented data and generating reports.

Basic Structure

pattern { action }
  • Works on input line by line
  • If pattern matches, execute action
  • Both pattern and action are optional: without a pattern the action runs on every line; without an action, matching lines are printed
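
The three forms can be compared side by side. This is a minimal sketch; the names and scores in the sample input are made up for illustration.

```shell
# Hypothetical sample input: a name and a score per line.
input='alice 72
bob 41
carol 90'

# Pattern and action: print the name when the score is at least 50.
both=$(printf '%s\n' "$input" | awk '$2 >= 50 { print $1 }')

# Pattern only: the default action prints the whole matching line.
pattern_only=$(printf '%s\n' "$input" | awk '$2 >= 50')

# Action only: the action runs for every line.
action_only=$(printf '%s\n' "$input" | awk '{ print $1 }')

printf '%s\n' "$both" "---" "$pattern_only" "---" "$action_only"
```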

Common Use Cases

1. Print specific fields

awk '{print $1, $3}' file.txt  # Print 1st and 3rd columns

2. Filter lines

awk -F ',' '$3 > 100' data.csv  # Print lines where 3rd comma-separated field > 100
awk '/error/ {print $0}' log    # Print lines containing "error"

3. Calculations

awk '{sum += $1} END {print sum}' numbers.txt  # Sum first column

4. Formatted Output

awk '{printf "%-10s %5d\n", $1, $2}' data.txt  # Formatted columns

5. Count Occurrences

awk '/404/ {count++} END {print count + 0}' access.log  # Count lines containing "404" (prints 0 if none)
awk '{count[$1]++} END {for (i in count) print i, count[i]}' data  # Frequency count
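
The order in which for (i in count) visits keys is not specified, so the frequency count is often piped through sort to get stable output. A small sketch, using made-up input values:

```shell
# Hypothetical input: one HTTP method per line.
out=$(printf '%s\n' GET POST GET GET POST |
  awk '{ count[$1]++ } END { for (k in count) print k, count[k] }' |
  sort)

printf '%s\n' "$out"
```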

6. Field Manipulation

awk '{$2 = $2 * 1.1; print}' prices.txt  # Increase 2nd field by 10%
awk '{print $NF}' file.txt               # Print last field of each line
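
One detail worth knowing about field assignment: when a field is modified, awk rebuilds the record using OFS, so comma-separated input comes back space-separated unless OFS is set as well. A minimal sketch, with a made-up input line:

```shell
# Hypothetical comma-separated input line.
line='widget,10'

# FS only: the rebuilt record is joined with the default OFS (a space).
fs_only=$(printf '%s\n' "$line" | awk -F ',' '{ $2 = $2 * 2; print }')

# FS and OFS both set: the comma survives the rebuild.
fs_ofs=$(printf '%s\n' "$line" | awk 'BEGIN { FS = OFS = "," } { $2 = $2 * 2; print }')

printf '%s\n' "$fs_only" "$fs_ofs"
```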

7. Multiple File Processing

awk 'FNR == 1 {print "Processing:", FILENAME} {print $0}' *.log
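
The FNR == 1 test works because FNR restarts at 1 for each file while NR keeps counting. The sketch below shows both counters next to each other; the file names and contents are invented for the demonstration and are created in a temporary directory.

```shell
# Create two hypothetical log files in a scratch directory.
dir=$(mktemp -d)
printf 'a\nb\n' > "$dir/one.log"
printf 'c\n'    > "$dir/two.log"

# Print the file name, the global record number, and the per-file record number.
out=$(cd "$dir" && awk '{ print FILENAME, NR, FNR }' one.log two.log)

rm -rf "$dir"
printf '%s\n' "$out"
```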

8. Conditional Logic

awk '{if ($2 >= 50) grade="Pass"; else grade="Fail"; print $1, grade}' scores.txt  # Pass/Fail based on 2nd field

9. Text Transformation

awk '{gsub(/http:/, "https:")} 1' urls.txt  # Replace protocol
awk 'length($0) > 80' text.txt             # Find long lines

10. Processing Colon-Separated Files

# Sample /etc/passwd processing (fields are colon-separated)
awk -F ':' '{print "User:", $1, "Shell:", $7}' /etc/passwd

# Input example line:
# root:x:0:0:root:/root:/bin/bash

# Output:
# User: root Shell: /bin/bash

Built-in Variables

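  • NR: Number of records read so far (the current line number, counted across all input files)
  • FNR: Record number within the current file (resets to 1 for each new file)
  • NF: Number of fields in the current record
  • FS: Input field separator (default: a single space, which splits on runs of whitespace)
  • OFS: Output field separator (default: a single space)
  • FILENAME: Name of the current input file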
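
A short sketch of several of these variables working together, assuming a made-up two-line comma-separated input. FILENAME is left out because the input here comes from a pipe rather than a named file.

```shell
# Hypothetical comma-separated input with a different field count per line.
input='a,b,c
d,e'

# Split on commas, join output with pipes, and show the record number,
# the field count, the first field, and the last field.
out=$(printf '%s\n' "$input" |
  awk 'BEGIN { FS = ","; OFS = "|" } { print NR, NF, $1, $NF }')

printf '%s\n' "$out"
```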

Useful Functions

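  • length(string): Length of string in characters (length with no argument means length($0))
  • substr(string, start, length): Substring of at most length characters beginning at position start (positions count from 1)
  • tolower(string), toupper(string): Case conversion
  • split(string, array, separator): Split string into array (indexed from 1) and return the number of elements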
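
The functions above can be tried on a single line of input. This is an illustrative sketch; the sample text and the array name parts are arbitrary.

```shell
out=$(printf '%s\n' 'Hello World' | awk '{
  print length($0)            # number of characters in the line
  print substr($0, 1, 5)      # 5 characters starting at position 1
  print toupper($1)           # first field in upper case
  n = split($0, parts, " ")   # split returns the number of pieces
  print n, parts[2]
}')

printf '%s\n' "$out"
```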

Special Patterns

  • BEGIN: Execute before processing input
  • END: Execute after processing input

Example:

BEGIN { 
  FS = ","
  print "Starting processing..."
}
{ print $1 }
END { print "Processing complete" }
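
To see the order in which the three blocks run, the example can be saved to a file and run against a small input. A sketch, assuming a temporary script file and two made-up input lines:

```shell
# Write the example program to a temporary script file.
script=$(mktemp)
cat > "$script" <<'EOF'
BEGIN {
  FS = ","
  print "Starting processing..."
}
{ print $1 }
END { print "Processing complete" }
EOF

# BEGIN runs once before any input, the main rule once per line, END once at the end.
out=$(printf 'a,1\nb,2\n' | awk -f "$script")

rm -f "$script"
printf '%s\n' "$out"
```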

Command Line Usage

# Common parameters:
awk -F ',' '{print $1}' file          # Set input field separator to comma
awk -v OFS='|' '{print $1,$2}' file   # Set output field separator to pipe
awk -f script.awk input.txt           # Execute commands from script file
gawk -i inplace '{...}' file          # Edit file in place (GNU awk 4.1+ only)
awk --version                         # Show version information (GNU awk; other implementations may differ)
awk --help                            # Display help information (GNU awk)

# One-liner
awk 'BEGIN {sum=0} {sum+=$1} END {print sum}' numbers.txt

# Use variables
awk -v var=10 '{print $1 + var}' data.txt

Summary of common options:

  • -F to specify the input field separator
  • -v var=value to set an awk variable before the program runs
  • -f to read the program from a file
  • -i inplace for in-place editing (GNU awk only)
  • --version and --help for version/help info (GNU awk; not guaranteed in other implementations)

This covers basic AWK usage. For more advanced features, see man awk or the official documentation.