In-depth on commands
Here are several Linux commands that are especially useful for processing text directly from the command line. These tools can be combined (often using pipes) to perform complex text-manipulation tasks.
Basic File Viewing and Counting
cat
Display the entire content of a file.
Example: cat file.txt
head
Show the first few lines of a file (default is 10).
Example: head -n 10 file.txt
tail
Show the last few lines of a file. Useful with the -f option for live updates.
Examples:
  tail -n 10 file.txt
  tail -f logfile.txt
wc
Count lines, words, and characters in a file.
Example: wc -l file.txt   # Count only lines
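The other counts mentioned above have their own flags:
  wc -w file.txt   # Count words
  wc -m file.txt   # Count characters
  wc -c file.txt   # Count bytes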
Searching and Filtering
grep
Search for patterns in files using regular expressions.
Examples:
  grep 'pattern' file.txt
  grep -R 'pattern' /path/to/directory   # Recursive search
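A few other standard flags are useful for everyday filtering:
  grep -i 'pattern' file.txt   # Case-insensitive match
  grep -v 'pattern' file.txt   # Print only lines that do NOT match
  grep -n 'pattern' file.txt   # Show line numbers alongside matches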
egrep / grep -E
Use extended regular expressions for more complex pattern matching.
Example: egrep 'pattern1|pattern2' file.txt
Editing and Transforming
sed
A stream editor for filtering and transforming text. Commonly used for substitutions.
Example: sed 's/old/new/g' file.txt   # Replace 'old' with 'new' on every line
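sed is not limited to substitution; it can also delete or extract lines:
  sed '/pattern/d' file.txt   # Delete every line matching the pattern
  sed -n '5,10p' file.txt     # Print only lines 5 through 10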
awk
A powerful programming language for text processing, ideal for pattern scanning and reporting.
Examples:
  awk '{print $1}' file.txt       # Print the first field (column) of each line
  awk -F, '{print $2}' file.txt   # Use comma as the field separator and print the second field
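Because awk is a full scripting language, it can also accumulate values across lines, for example to total a numeric column:
  awk '{sum += $1} END {print sum}' file.txt   # Sum the values in the first column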
cut
Remove sections from each line of files by specifying a delimiter and field.
Example: cut -d',' -f1 file.txt   # Extract the first comma-separated field
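cut can also select by character position instead of delimited fields:
  cut -c1-10 file.txt   # Keep only the first ten characters of each line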
tr
Translate or delete characters.
Examples:
  tr 'a-z' 'A-Z' < file.txt   # Convert all lowercase letters to uppercase
  tr -d '\r' < file.txt       # Remove carriage return characters
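tr can also squeeze runs of a repeated character into one, which is handy for normalizing whitespace:
  tr -s ' ' < file.txt   # Collapse runs of spaces into a single space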
Sorting, Merging, and Comparing
sort
Sort lines in text files.
Examples:
  sort file.txt
  sort -r file.txt   # Reverse sort
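Two other standard options worth knowing: -n sorts numerically rather than lexicographically, and -k sorts by a specific field:
  sort -n file.txt        # Numeric sort
  sort -k2 -n file.txt    # Sort by the second whitespace-separated field, numerically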
uniq
Filter out or report repeated lines (typically used after sort to work correctly).
Examples:
  sort file.txt | uniq
  uniq -c file.txt   # Prefix each line of a sorted file with its repeat count
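uniq can also report only the duplicated or only the unique lines of sorted input:
  sort file.txt | uniq -d   # Show only lines that appear more than once
  sort file.txt | uniq -u   # Show only lines that appear exactly once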
paste
Merge lines of files side-by-side.
Example: paste file1.txt file2.txt
join
Join lines of two files on a common field (files must be sorted on the join field).
Example: join file1.txt file2.txt
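If the inputs are not already sorted, you can sort them on the fly. Note that the process-substitution syntax below is a bash/zsh feature, not plain POSIX sh:
  join <(sort file1.txt) <(sort file2.txt)   # Sort both inputs before joining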
comm
Compare two sorted files line by line and display common or unique lines.
Example: comm file1.txt file2.txt
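comm prints three columns (lines unique to file1, lines unique to file2, lines common to both); the -1, -2, and -3 flags suppress columns, so combining them isolates one set:
  comm -12 file1.txt file2.txt   # Lines present in both files
  comm -23 file1.txt file2.txt   # Lines only in file1.txt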
diff
Compare files line by line and output the differences.
Example: diff file1.txt file2.txt
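The unified format (-u) is the style used by patches and most code-review tools:
  diff -u file1.txt file2.txt   # Unified diff with a few lines of context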
Formatting and Building Command Pipelines
fmt
Reformat text paragraphs to a desired width, making it easier to read.
Example: fmt -w 80 file.txt   # Wrap text to 80 characters per line
xargs
Build and execute command lines from standard input. Useful for running a command over a list of items.
Example: find . -name "*.txt" | xargs grep 'pattern'
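When file names may contain spaces or newlines, pair find's -print0 with xargs -0 (both supported by GNU and BSD versions) so the names are passed as NUL-separated items:
  find . -name "*.txt" -print0 | xargs -0 grep 'pattern'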
Additional Tips
Combining Commands with Pipes (|):
You can chain these commands together to create powerful text-processing pipelines. For example:
  cat file.txt | grep 'error' | sort | uniq -c | sort -nr
This pipeline extracts the lines containing 'error', counts the duplicates, and lists them by frequency, most frequent first.
Scripting:
For more complex tasks, consider writing a shell script or using higher-level languages like Python or Perl directly from the command line.
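As a minimal sketch of that idea, the script below (the name top_errors.sh and its arguments are just placeholders) wraps the error-counting pipeline from above so it can be reused:
  #!/bin/sh
  # Usage: ./top_errors.sh logfile [count]
  # Print the most frequent lines containing 'error' in the given log.
  file=$1
  count=${2:-10}
  grep 'error' "$file" | sort | uniq -c | sort -nr | head -n "$count"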
These commands form a toolkit that, when combined effectively, can help you perform a wide range of text processing and data manipulation tasks in Linux. Experiment with them to become more comfortable with their syntax and capabilities!