# Unix Text Processing Tools Quickstart Guide

## Basic File Viewing
### head

View the beginning of files:

```bash
head file.txt       # Show first 10 lines
head -n 5 file.txt  # Show first 5 lines
head -c 20 file.txt # Show first 20 bytes
```
### tail

View the end of files:

```bash
tail file.txt        # Show last 10 lines
tail -n 15 file.txt  # Show last 15 lines
tail -f logfile.log  # Follow (watch) file in real time
```
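A less obvious but handy variant: giving `tail` a line number prefixed with `+` makes it count from the top of the file instead of the bottom, printing everything from that line onward. A minimal sketch (the `/tmp/sample.txt` path is just illustrative):

```bash
# Build a small sample file containing the lines "1" through "10"
seq 1 10 > /tmp/sample.txt

# A plus sign makes tail count from the top: start output AT line 5
tail -n +5 /tmp/sample.txt   # prints lines 5 through 10
```

This pairs naturally with `head` when you need to skip a header line, e.g. `tail -n +2 data.csv`.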
## File Contents Manipulation
### cat

Concatenate and display files:

```bash
cat file.txt                   # Display entire file
cat -n file.txt                # Show line numbers
cat file1 file2 > combined.txt # Combine files
```
### less / more

Pager programs:

```bash
less largefile.log  # Scroll with arrows / PgUp-PgDn (q to quit)
more largefile.log  # Basic pager (space for next page)
```
Key differences:

| Feature | less | more |
|---|---|---|
| Navigation | Bidirectional (↑/↓, PgUp/PgDn) | Forward-only (spacebar) |
| Exit behavior | Stays open after reaching EOF | Auto-exits at EOF |
| Search | Regex search with `/` | Basic search |
| Growing files | Can follow growing files | Static view |
| Large files | More efficient with large files | Simpler implementation |
| Position info | Shows percentage and line numbers | Basic line count |
When to use which:

- Use `less` for most interactive viewing (the modern default)
- Use `more` for simple forward-only viewing
- Both support `q` to quit and `/` to search (but `less` has better search)
## Text Processing
### wc

Word count:

```bash
wc file.txt    # Line, word, and byte counts
wc -l file.txt # Count lines only
```
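One `wc` gotcha worth knowing in scripts: when given a filename argument, `wc` echoes that filename after the count, but when reading from stdin it prints the bare number. A small sketch (`/tmp/three.txt` is just a throwaway sample file):

```bash
# Create a three-line sample file
printf 'a\nb\nc\n' > /tmp/three.txt

wc -l /tmp/three.txt    # prints the count followed by the filename
wc -l < /tmp/three.txt  # prints just "3" -- easier to use in scripts
```

The stdin form is the usual choice when capturing the count into a variable, e.g. `n=$(wc -l < file.txt)`.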
### grep

Pattern searching:

```bash
grep "error" log.txt          # Search for 'error'
grep -i "warning" log.txt     # Case-insensitive search
grep -v "debug" log.txt       # Invert match (exclude lines)
grep -r "pattern" directory/  # Recursive search
```
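Two more `grep` flags that come up constantly are `-c` (count matching lines) and `-n` (prefix each match with its line number). A quick sketch against a fabricated log file:

```bash
# Fabricated sample log
printf 'ok\nerror: disk\nok\nerror: net\n' > /tmp/log.txt

grep -c "error" /tmp/log.txt  # count of matching lines: 2
grep -n "error" /tmp/log.txt  # each match prefixed with its line number
```

Note that `-c` counts matching *lines*, not total occurrences; a line containing the pattern twice still counts once.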
### sort

Sort lines:

```bash
sort file.txt    # Alphabetical sort
sort -n data.txt # Numerical sort
sort -u file.txt # Unique sort (remove duplicates)
```
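`sort` can also key on a single field of delimited data, which is useful for CSV-like files: `-t` sets the field separator and `-k` picks the field. A sketch with a made-up name,score file:

```bash
# Hypothetical CSV: name,score
printf 'bob,3\nann,10\ncal,2\n' > /tmp/scores.csv

# -t',' splits on commas, -k2 keys on the second field, -n sorts numerically
sort -t',' -k2 -n /tmp/scores.csv
# cal,2 first, ann,10 last
```

Without `-n`, "10" would sort before "2" because the comparison is lexicographic.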
### uniq

Report or omit repeated lines (only *adjacent* duplicates are considered):

```bash
uniq file.txt    # Remove consecutive duplicates
uniq -c file.txt # Prefix lines with their repeat count
```
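Because `uniq` only collapses adjacent duplicates, it is almost always preceded by `sort` in practice. A sketch showing the difference (`/tmp/dups.txt` is just a sample):

```bash
# Duplicates that are NOT adjacent
printf 'b\na\nb\n' > /tmp/dups.txt

uniq /tmp/dups.txt         # still 3 lines -- the two b's aren't adjacent
sort /tmp/dups.txt | uniq  # 2 lines: a, b
```

`sort | uniq` is equivalent to `sort -u` for plain deduplication, but only `uniq` offers `-c` for counting.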
### cut

Remove sections from lines:

```bash
cut -d',' -f2 data.csv # Extract second column using comma delimiter
cut -c1-5 file.txt     # Extract first 5 characters
```
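One caveat when splitting on spaces: `cut` treats *every* occurrence of the delimiter as a field boundary, so runs of spaces produce empty fields, whereas `awk` collapses whitespace runs. A sketch with a sample file:

```bash
# Three spaces between the two words
printf 'alpha   beta\n' > /tmp/spaced.txt

cut -d' ' -f2 /tmp/spaced.txt    # prints an empty line (field 2 is empty)
awk '{print $2}' /tmp/spaced.txt # prints "beta"
```

For whitespace-aligned output such as `ls -l` or `ps`, prefer `awk` over `cut -d' '`.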
### awk

Pattern scanning and processing:

```bash
awk '{print $1}' file.txt        # Print first column
awk -F: '{print $3}' /etc/passwd # Split on colon
awk 'NR > 5 && NR < 10' file.txt # Show lines 6-9
```
## Combining Commands (Pipes)

```bash
# Common pipeline example:
grep "ERROR" log.txt | cut -d' ' -f3- | sort | uniq -c | sort -rn | head -n 20

# Breakdown:
# 1. Find lines containing "ERROR"
# 2. Extract from the 3rd field to the end of each line
# 3. Sort so duplicate messages become adjacent
# 4. Count each unique message
# 5. Re-sort by count, highest first
# 6. Show the top 20
```

(Note the second `sort -rn`: without it, `head` would show the alphabetically first 20 messages rather than the most frequent ones.)
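The pipeline above can be exercised end to end against a fabricated mini log (the `/tmp/demo.log` path and log format are assumptions for the demo; the message starts at field 2 here, so the `cut` field differs from the example above):

```bash
# Fabricated log: LEVEL message...
printf 'INFO boot ok\nERROR disk full\nERROR disk full\nERROR net down\n' > /tmp/demo.log

# Same pattern: filter, strip the level, group, count, rank
grep "ERROR" /tmp/demo.log | cut -d' ' -f2- | sort | uniq -c | sort -rn | head -n 20
# "disk full" appears first with count 2, then "net down" with count 1
```

This filter / project / sort / count / rank shape is the workhorse pattern for ad-hoc log analysis.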
## Tips & Tricks

- Show lines 16-20 by combining head/tail: `head -n 20 file.txt | tail -n 5`
- Monitor a growing file: `tail -f access.log | grep "404"`
- Count CSV rows: `wc -l data.csv`
- Find unique IPs in logs: `cut -d' ' -f1 access.log | sort -u`
- Sum numbers in a column: `awk '{sum+=$3} END {print sum}' data.txt`